<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Aaron Glatzer, Author at Be on the Right Side of Change</title>
	<atom:link href="https://blog.finxter.com/author/aaronglatzer/feed/" rel="self" type="application/rss+xml" />
	<link>https://blog.finxter.com/author/aaronglatzer/</link>
	<description></description>
	<lastBuildDate>Mon, 09 Jan 2023 19:06:13 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9.4</generator>

<image>
	<url>https://blog.finxter.com/wp-content/uploads/2020/08/cropped-cropped-finxter_nobackground-32x32.png</url>
	<title>Aaron Glatzer, Author at Be on the Right Side of Change</title>
	<link>https://blog.finxter.com/author/aaronglatzer/</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title>Using PyTorch to Build a Working Neural Network</title>
		<link>https://blog.finxter.com/using-pytorch-to-build-a-working-neural-network/</link>
		
		<dc:creator><![CDATA[Aaron Glatzer]]></dc:creator>
		<pubDate>Fri, 18 Nov 2022 20:19:12 +0000</pubDate>
				<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[Deep Learning]]></category>
		<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[PyTorch]]></category>
		<guid isPermaLink="false">https://blog.finxter.com/?p=903641</guid>

					<description><![CDATA[<p>In this article, we will use PyTorch to build a working neural network. Specifically, this network will be trained to recognize handwritten numerical digits using the famous MNIST dataset. The code in this article borrows heavily from the PyTorch tutorial &#8220;Learn the Basics&#8221;. We do this for several reasons. Knowledge Background This article assumes the ... <a title="Using PyTorch to Build a Working Neural Network" class="read-more" href="https://blog.finxter.com/using-pytorch-to-build-a-working-neural-network/" aria-label="Read more about Using PyTorch to Build a Working Neural Network">Read more</a></p>
<p>The post <a href="https://blog.finxter.com/using-pytorch-to-build-a-working-neural-network/">Using PyTorch to Build a Working Neural Network</a> appeared first on <a href="https://blog.finxter.com">Be on the Right Side of Change</a>.</p>
]]></description>
										<content:encoded><![CDATA[<div class="wp-block-image">
<figure class="aligncenter size-large"><img fetchpriority="high" decoding="async" width="1024" height="682" src="https://blog.finxter.com/wp-content/uploads/2022/11/image-201-1024x682.png" alt="" class="wp-image-904029" srcset="https://blog.finxter.com/wp-content/uploads/2022/11/image-201-1024x682.png 1024w, https://blog.finxter.com/wp-content/uploads/2022/11/image-201-300x200.png 300w, https://blog.finxter.com/wp-content/uploads/2022/11/image-201-768x512.png 768w, https://blog.finxter.com/wp-content/uploads/2022/11/image-201.png 1255w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure>
</div>


<p>In this article, we will use <a rel="noreferrer noopener" href="https://blog.finxter.com/pytorch-developer-income-and-opportunity/" data-type="post" data-id="255891" target="_blank">PyTorch</a> to build a working neural network. Specifically, this network will be trained to recognize handwritten numerical digits using the famous MNIST dataset.</p>



<figure class="wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio"><div class="wp-block-embed__wrapper">
<iframe title="Using PyTorch to Build a Working Neural Network" width="937" height="527" src="https://www.youtube.com/embed/e02w3bKhFe8?feature=oembed" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe>
</div></figure>



<p>The code in this article borrows heavily from the PyTorch tutorial <a href="https://pytorch.org/tutorials/beginner/basics/intro.html#learn-the-basics" target="_blank" rel="noreferrer noopener">&#8220;Learn the Basics&#8221;</a>. We do this for several reasons. </p>



<ul class="wp-block-list">
<li>First, that tutorial is pretty good at demonstrating the essentials for getting a working neural network. </li>



<li>Second, just like importing libraries, it&#8217;s good to not reinvent the wheel when you don&#8217;t have to. </li>



<li>Third, when building your own network, it is very helpful to start with something that is known to work, then modify it to your needs.</li>
</ul>



<h2 class="wp-block-heading">Knowledge Background</h2>



<figure class="wp-block-image size-large"><img decoding="async" width="1024" height="682" src="https://blog.finxter.com/wp-content/uploads/2022/11/image-202-1024x682.png" alt="" class="wp-image-904030" srcset="https://blog.finxter.com/wp-content/uploads/2022/11/image-202-1024x682.png 1024w, https://blog.finxter.com/wp-content/uploads/2022/11/image-202-300x200.png 300w, https://blog.finxter.com/wp-content/uploads/2022/11/image-202-768x512.png 768w, https://blog.finxter.com/wp-content/uploads/2022/11/image-202.png 1255w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure>



<p>This article assumes the reader has some necessary background:</p>



<ol class="wp-block-list">
<li>Familiarity with <a href="https://blog.finxter.com/python-crash-course/" data-type="post" data-id="3951" target="_blank" rel="noreferrer noopener">Python</a>, and Python <a href="https://blog.finxter.com/introduction-to-python-classes/" data-type="post" data-id="30596" target="_blank" rel="noreferrer noopener">object-oriented programming</a>.</li>



<li>Familiarity with how neural networks work. See the Finxter article <a href="https://blog.finxter.com/the-magic-of-neural-networks-how-they-work/" target="_blank" rel="noreferrer noopener">&#8220;The Magic of Neural Networks: History and Concepts&#8221;</a> to learn the basic ideas.</li>



<li>Familiarity with how neural networks learn. See the Finxter article <a href="https://blog.finxter.com/how-neural-networks-learn/" target="_blank" rel="noreferrer noopener">&#8220;How Neural Networks Learn&#8221;</a> to learn this subject.</li>



<li>Familiarity with tensors. See the Finxter article <a href="https://blog.finxter.com/tensors-the-vocabulary-of-neural-networks/" target="_blank" rel="noreferrer noopener">&#8220;Tensors: the Vocabulary of Neural Networks&#8221;</a> to learn this subject.</li>



<li>Familiarity with <a href="https://blog.finxter.com/matplotlib-full-guide/" data-type="post" data-id="20151" target="_blank" rel="noreferrer noopener">Matplotlib</a>. While this is not necessary to follow along, it is necessary if you want to be able to view image data yourself on your own datasets in the future (and you <em>will</em> want to be able to do this).</li>
</ol>



<p>You can run PyTorch on your own machine, or you can run it on publicly available computer systems. </p>



<p>We will run this exercise on Google Colab, which provides access to powerful computing hardware, including GPUs, for free. </p>



<p class="has-base-background-color has-background"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f30d.png" alt="🌍" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Recommended</strong>: Other options for publicly available computing are shown in the Finxter article <a rel="noreferrer noopener" href="https://blog.finxter.com/survey-of-python-online-notebook-options/" target="_blank">&#8220;Top 4 Jupyter Notebook Alternatives for Machine Learning&#8221;</a>.</p>



<h2 class="wp-block-heading">Process Overview</h2>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="625" height="938" src="https://blog.finxter.com/wp-content/uploads/2022/11/image-203.png" alt="" class="wp-image-904033" srcset="https://blog.finxter.com/wp-content/uploads/2022/11/image-203.png 625w, https://blog.finxter.com/wp-content/uploads/2022/11/image-203-200x300.png 200w" sizes="auto, (max-width: 625px) 100vw, 625px" /></figure>
</div>


<p>This article will cover all the necessary steps to build and test a working neural network using the <a href="https://blog.finxter.com/how-to-install-pytorch-on-pycharm/" data-type="post" data-id="35142" target="_blank" rel="noreferrer noopener">PyTorch library</a>. </p>



<p>PyTorch provides a framework that makes building, training, and using <a rel="noreferrer noopener" href="https://blog.finxter.com/tutorial-how-to-create-your-first-neural-network-in-1-line-of-python-code/" data-type="post" data-id="2463" target="_blank">neural networks</a> easier. Under the hood, much of it is written in the fast <a rel="noreferrer noopener" href="https://blog.finxter.com/c-plus-plus-developer-income-and-opportunity/" data-type="post" data-id="196896" target="_blank">C++</a> language, so those networks can deliver high performance while you create them through the popular Python language.</p>



<p>Neural networks and the PyTorch library are rich subjects. So while we will cover all the necessary steps, each step will just scratch the surface of its respective subject. </p>



<p>For example, we will get the image data from datasets built into the PyTorch library. However, you will eventually want to use neural networks on your own data, so you will need to learn how to build and work with your own datasets. </p>



<p>So for each of these steps, you will want to learn more about the subject to become a proficient PyTorch user.</p>



<p>Nevertheless, by the end of this article, you will have built your own working neural network, so you can be sure you will know how to do it! </p>



<p>Further learning will enrich those abilities. Throughout the article, we will point out some of the other things you will eventually want to learn for each step.</p>



<p>Here are the steps we will be taking:</p>



<ol class="wp-block-list">
<li>Import necessary libraries.</li>



<li>Acquire the data.</li>



<li>Review the data to understand it.</li>



<li>Create data loaders for loading the data into the network.</li>



<li>Design and create the neural network.</li>



<li>Specify the loss measure and the optimizer algorithm.</li>



<li>Specify the training and testing functions.</li>



<li>Train and test the network using the specified functions.</li>
</ol>



<h2 class="wp-block-heading">Step 1: Import Necessary Libraries</h2>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img loading="lazy" decoding="async" width="1024" height="682" src="https://blog.finxter.com/wp-content/uploads/2022/11/image-204-1024x682.png" alt="" class="wp-image-904037" srcset="https://blog.finxter.com/wp-content/uploads/2022/11/image-204-1024x682.png 1024w, https://blog.finxter.com/wp-content/uploads/2022/11/image-204-300x200.png 300w, https://blog.finxter.com/wp-content/uploads/2022/11/image-204-768x512.png 768w, https://blog.finxter.com/wp-content/uploads/2022/11/image-204.png 1255w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>
</div>


<p>Before we do anything, we will want to set up our runtime to use the GPU (again, assuming here you are using Colab). </p>



<p>Click on <strong>&#8220;Runtime&#8221;</strong> in the top menu bar, and then choose <strong>&#8220;Change runtime type&#8221;</strong> from the dropdown. Then from the window that pops up choose <strong>&#8220;GPU&#8221;</strong> under <strong>&#8220;Hardware accelerator&#8221;</strong>, and then click <strong>&#8220;Save&#8221;</strong>.</p>



<p>Next, we will need to import a number of libraries:</p>



<ol class="wp-block-list">
<li>We will import the <code>torch</code> package, making PyTorch available for use.</li>



<li>From the <code>torch</code> package we will import the <code>nn</code> module, which provides the building blocks for the neural network.</li>



<li>From the <code>torchvision</code> package we will import the <code>datasets</code> module, which provides the image datasets.</li>



<li>From the <code>torch.utils.data</code> module we will import the <code>DataLoader</code> class. Data loaders help load data into the network.</li>



<li>From the <code>torchvision.transforms</code> module we will import the <code>ToTensor</code> class. This transform converts the image data into tensors so that they are ready to be processed through the network.</li>
</ol>



<p>Here is the code importing the needed modules:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import torch
from torch import nn
from torchvision import datasets
from torch.utils.data import DataLoader
from torchvision.transforms import ToTensor</pre>



<h2 class="wp-block-heading">Step 2: Acquire the Data</h2>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img loading="lazy" decoding="async" width="1024" height="682" src="https://blog.finxter.com/wp-content/uploads/2022/11/image-205-1024x682.png" alt="" class="wp-image-904041" srcset="https://blog.finxter.com/wp-content/uploads/2022/11/image-205-1024x682.png 1024w, https://blog.finxter.com/wp-content/uploads/2022/11/image-205-300x200.png 300w, https://blog.finxter.com/wp-content/uploads/2022/11/image-205-768x512.png 768w, https://blog.finxter.com/wp-content/uploads/2022/11/image-205.png 1255w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>
</div>


<p>As mentioned before, in this exercise we will get the MNIST data directly through the PyTorch libraries. This is the quickest and easiest approach to getting the data.</p>



<p>If you want the original datasets, they are available at:</p>



<p><a href="http://yann.lecun.com/exdb/mnist/" target="_blank" rel="noreferrer noopener">http://yann.lecun.com/exdb/mnist/</a></p>



<p>Even though we will get the data through the PyTorch libraries, it can still be helpful to review this page, as it provides some useful information about the dataset. (However, we will provide everything you need to understand this dataset in this article.)</p>



<p class="has-global-color-8-background-color has-background"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Note</strong>: Firefox has trouble accessing this page, for some reason requiring a login to access it. Either view it using another browser, or view it as recorded on the Internet Archive Wayback Machine.</p>



<p>There are multiple datasets available through the PyTorch dataset libraries. Here are PyTorch webpages linking to <a href="https://pytorch.org/vision/stable/datasets.html" target="_blank" rel="noreferrer noopener">Image Datasets</a>, <a href="https://pytorch.org/text/stable/datasets.html" target="_blank" rel="noreferrer noopener">Text Datasets</a>, and <a href="https://pytorch.org/audio/stable/datasets.html" target="_blank" rel="noreferrer noopener">Audio Datasets</a>.</p>



<p>To get data from a PyTorch dataset we create an instance from the respective dataset class. Here is the format:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">dataset_instance = DatasetClass(parameters)</pre>



<p>This creates a dataset object, and downloads the data. The data is then available by working with the dataset object.</p>



<p>Here is the code to create our MNIST datasets:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Download the MNIST training data and wrap it in a PyTorch dataset
mnist_data = datasets.MNIST(
    root='mnist_nn',
    train=True,
    download=True,
    transform=ToTensor()
)
</pre>



<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Download the MNIST test data (train=False) into a second dataset
mnist_test_data = datasets.MNIST(
    root='mnist_nn',
    train=False,
    download=True,
    transform=ToTensor()
)</pre>



<p>You&#8217;ll use these parameters:</p>



<ul class="wp-block-list">
<li>The <code>root</code> parameter specifies the directory where the downloaded data will be placed. </li>



<li>The <code>train</code> parameter determines whether the training data (<code>True</code>) or the testing data (<code>False</code>) is used. </li>



<li>The <code>download=True</code> parameter confirms the data should be downloaded if it hasn&#8217;t been already. </li>



<li>The <code>transform</code> parameter specifies a transformation to apply to each image. In this case, <code>ToTensor()</code> converts the images into <a href="https://blog.finxter.com/tensors-the-vocabulary-of-neural-networks/" data-type="post" data-id="616223" target="_blank" rel="noreferrer noopener">tensors</a>.</li>
</ul>



<p>Which parameters are available varies from dataset to dataset, as does how the data is structured, so refer to the dataset web pages mentioned above to review the details of what is available and needed.</p>



<p>While this method of getting data is convenient and easy, remember that you will eventually want to work with your own data, so eventually, you will want to learn how to create your own datasets.</p>
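
<p>As a rough illustration of what creating your own dataset involves, here is a minimal sketch of a custom dataset class. It assumes the common PyTorch pattern of subclassing <code>Dataset</code> and defining <code>__len__</code> and <code>__getitem__</code>. The class name <code>RandomDigitDataset</code> and its random contents are hypothetical stand-ins for real data.</p>

```python
import torch
from torch.utils.data import Dataset


class RandomDigitDataset(Dataset):
    """Hypothetical stand-in for your own data: random 28x28 'images' with labels 0-9."""

    def __init__(self, num_items=20, seed=0):
        gen = torch.Generator().manual_seed(seed)
        # Shape (N, 1, 28, 28): N grayscale images in the same layout ToTensor() produces
        self.images = torch.rand(num_items, 1, 28, 28, generator=gen)
        self.labels = torch.randint(0, 10, (num_items,), generator=gen)

    def __len__(self):
        # Lets len(dataset) work
        return len(self.labels)

    def __getitem__(self, idx):
        # Lets dataset[idx] work; returns an (image, label) tuple like the MNIST dataset
        return self.images[idx], int(self.labels[idx])


toy_data = RandomDigitDataset()
image, label = toy_data[0]
print(len(toy_data), image.shape, type(label))
```

<p>A real dataset class would typically read images and labels from files instead of generating them, but the two methods shown are the part that the rest of PyTorch relies on.</p>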



<p>Also, not all datasets contain images with uniform image size, so images may need to be cropped or stretched to fit the fixed number of input neurons. </p>



<p>Other transformations can be helpful as well. </p>



<p>For example, you can effectively expand your dataset by including subcrops of your original images as additional images to train on. Data transformations are therefore something else you will want to learn to use at this stage in the process.</p>
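
<p>To make the subcrop idea concrete, here is a small sketch written in plain PyTorch. The helper <code>random_subcrop</code> is a hypothetical function written for this illustration, not part of any library, and the random tensor stands in for a real image. In practice, the <code>torchvision.transforms</code> module offers ready-made transforms such as <code>RandomCrop</code> for this purpose.</p>

```python
import torch


def random_subcrop(img, size, generator=None):
    """Hypothetical helper: cut a random size x size patch from a (C, H, W) tensor."""
    _, height, width = img.shape
    # Pick the top-left corner so the patch stays inside the image
    top = int(torch.randint(0, height - size + 1, (1,), generator=generator))
    left = int(torch.randint(0, width - size + 1, (1,), generator=generator))
    return img[:, top:top + size, left:left + size]


gen = torch.Generator().manual_seed(0)
image = torch.rand(1, 28, 28, generator=gen)   # stand-in for one MNIST image
crops = [random_subcrop(image, 24, gen) for _ in range(4)]
print([tuple(c.shape) for c in crops])
```

<p>Note that a network with a fixed number of input neurons would still need each crop resized back to the expected image size before training on it.</p>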



<h2 class="wp-block-heading">Step 3: Review the Dataset</h2>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img loading="lazy" decoding="async" width="1024" height="682" src="https://blog.finxter.com/wp-content/uploads/2022/11/image-206-1024x682.png" alt="" class="wp-image-904043" srcset="https://blog.finxter.com/wp-content/uploads/2022/11/image-206-1024x682.png 1024w, https://blog.finxter.com/wp-content/uploads/2022/11/image-206-300x200.png 300w, https://blog.finxter.com/wp-content/uploads/2022/11/image-206-768x512.png 768w, https://blog.finxter.com/wp-content/uploads/2022/11/image-206.png 1255w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>
</div>


<p>Now that we have downloaded the data and created a dataset, let&#8217;s review the dataset to understand its contents and structure.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">type(mnist_data)
# torchvision.datasets.mnist.MNIST</pre>






<p>The <code><a href="https://blog.finxter.com/python-type/" data-type="post" data-id="23967" target="_blank" rel="noreferrer noopener">type()</a></code> function shows that our dataset is an object of the MNIST dataset class.</p>



<p>Conveniently, PyTorch datasets have been designed to behave much like lists: they can be indexed, and they report their length. Let&#8217;s take advantage of this and use the <code><a href="https://blog.finxter.com/python-len/" data-type="post" data-id="22386" target="_blank" rel="noreferrer noopener">len()</a></code> function to learn something about our datasets:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">len(mnist_data)
# 60000</pre>






<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">len(mnist_test_data)
# 10000</pre>



<p>So our training dataset contains 60,000 items, and our test dataset contains 10,000 items, matching the sizes documented for the MNIST training and test sets.</p>



<p>Let&#8217;s use the <code>type()</code> and <code>len()</code> functions to examine the first item in the training dataset:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">type(mnist_data[0])
# tuple</pre>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">len(mnist_data[0])
# 2</pre>






<p>So the items in the datasets are tuples containing 2 items.</p>



<p>Let&#8217;s use the <code>type()</code> function to learn about the first item in the <a rel="noreferrer noopener" href="https://blog.finxter.com/the-ultimate-guide-to-python-tuples/" data-type="post" data-id="12043" target="_blank">tuple</a>:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">type(mnist_data[0][0])
# torch.Tensor</pre>



<p>So the first item in the tuple is a tensor, likely some image data.</p>



<p>Let&#8217;s examine the shape attribute of the tensor to understand its shape:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">mnist_data[0][0].shape
# torch.Size([1, 28, 28])</pre>



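<p>This is consistent with the 28*28 pixel structure of the image data, plus one leading dimension of size 1. That leading dimension is the channel dimension: <code>ToTensor()</code> returns images in (channel, height, width) layout, and a grayscale image has a single channel.</p>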



<p>Let&#8217;s examine the second item in the tuple:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">type(mnist_data[0][1])
# int</pre>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">mnist_data[0][1]
# 5</pre>






<p>So the second item is the integer <code>5</code>, apparently the label for an image of the digit 5.</p>



<p>Let&#8217;s use Matplotlib to view the image:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import matplotlib.pyplot as plt
plt.imshow(mnist_data[0][0], cmap='gray')</pre>



<p>Output:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">TypeError                                 Traceback (most recent call last)
&lt;ipython-input-14-3e7278364eac> in &lt;module>
----> 1 plt.imshow(mnist_data[0][0], cmap='gray')

/usr/local/lib/python3.7/dist-packages/matplotlib/pyplot.py in imshow(X, cmap, norm, aspect, interpolation, alpha, vmin, vmax, origin, extent, shape, filternorm, filterrad, imlim, resample, url, data, **kwargs)
   2649         filternorm=filternorm, filterrad=filterrad, imlim=imlim,
   2650         resample=resample, url=url, **({"data": data} if data is not
-> 2651         None else {}), **kwargs)
   2652     sci(__ret)
   2653     return __ret

/usr/local/lib/python3.7/dist-packages/matplotlib/__init__.py in inner(ax, data, *args, **kwargs)
   1563     def inner(ax, *args, data=None, **kwargs):
   1564         if data is None:
-> 1565             return func(ax, *map(sanitize_sequence, args), **kwargs)
   1566 
   1567         bound = new_sig.bind(ax, *args, **kwargs)

/usr/local/lib/python3.7/dist-packages/matplotlib/cbook/deprecation.py in wrapper(*args, **kwargs)
    356                 f"%(removal)s.  If any parameter follows {name!r}, they "
    357                 f"should be pass as keyword, not positionally.")
--> 358         return func(*args, **kwargs)
    359 
    360     return wrapper

/usr/local/lib/python3.7/dist-packages/matplotlib/cbook/deprecation.py in wrapper(*args, **kwargs)
    356                 f"%(removal)s.  If any parameter follows {name!r}, they "
    357                 f"should be pass as keyword, not positionally.")
--> 358         return func(*args, **kwargs)
    359 
    360     return wrapper

/usr/local/lib/python3.7/dist-packages/matplotlib/axes/_axes.py in imshow(self, X, cmap, norm, aspect, interpolation, alpha, vmin, vmax, origin, extent, shape, filternorm, filterrad, imlim, resample, url, **kwargs)
   5624                               resample=resample, **kwargs)
   5625 
-> 5626         im.set_data(X)
   5627         im.set_alpha(alpha)
   5628         if im.get_clip_path() is None:

/usr/local/lib/python3.7/dist-packages/matplotlib/image.py in set_data(self, A)
    697                 or self._A.ndim == 3 and self._A.shape[-1] in [3, 4]):
    698             raise TypeError("Invalid shape {} for image data"
--> 699                             .format(self._A.shape))
    700 
    701         if self._A.ndim == 3:

TypeError: Invalid shape (1, 28, 28) for image data
</pre>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="314" height="302" src="https://blog.finxter.com/wp-content/uploads/2022/11/image-255.png" alt="" class="wp-image-914864" srcset="https://blog.finxter.com/wp-content/uploads/2022/11/image-255.png 314w, https://blog.finxter.com/wp-content/uploads/2022/11/image-255-300x289.png 300w" sizes="auto, (max-width: 314px) 100vw, 314px" /></figure>
</div>


<p>Oops, that extra size-1 leading dimension is causing us problems: Matplotlib expects a two-dimensional array for a grayscale image. We can use the <code>squeeze()</code> method on the tensor to remove any dimensions of size 1, which returns a two-dimensional 28*28 tensor instead of the three-dimensional tensor we had before.</p>



<p>Let&#8217;s try again:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">plt.imshow(mnist_data[0][0].squeeze(), cmap='gray')
# &lt;matplotlib.image.AxesImage at 0x7f5b5e336150></pre>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="491" height="294" src="https://blog.finxter.com/wp-content/uploads/2022/11/image-256.png" alt="" class="wp-image-914865" srcset="https://blog.finxter.com/wp-content/uploads/2022/11/image-256.png 491w, https://blog.finxter.com/wp-content/uploads/2022/11/image-256-300x180.png 300w" sizes="auto, (max-width: 491px) 100vw, 491px" /></figure>
</div>


<p>Well, it&#8217;s a little sloppy, but that&#8217;s plausibly the digit 5, which is reasonable to expect from a handwritten digit!</p>
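
<p>The shape change that <code>squeeze()</code> performs can also be checked without plotting anything. The following sketch uses a blank tensor as a stand-in for one MNIST image. The variable names are illustrative only.</p>

```python
import torch

# Stand-in for one MNIST image as ToTensor() returns it: (channel, height, width)
image = torch.zeros(1, 28, 28)

squeezed = image.squeeze()        # drops every size-1 dimension
indexed = image[0]                # selects the single channel explicitly
restored = squeezed.unsqueeze(0)  # puts the leading dimension back

print(image.shape)      # torch.Size([1, 28, 28])
print(squeezed.shape)   # torch.Size([28, 28])
print(indexed.shape)    # torch.Size([28, 28])
print(restored.shape)   # torch.Size([1, 28, 28])
```

<p>Indexing with <code>[0]</code> gives the same two-dimensional result here and can be the safer choice when you only want to drop the leading dimension, since <code>squeeze()</code> with no arguments removes every dimension of size 1.</p>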



<p>So it looks like each item in the dataset is a tuple containing an image (in tensor format) and its corresponding label.</p>



<p>Let&#8217;s use Matplotlib to look at the first 10 images, and title each image with its corresponding label:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">fig, axs = plt.subplots(2, 5, figsize=(8, 5))
for a_row in range(2):
  for a_col in range(5):
    img_no = a_row*5 + a_col
    img = mnist_data[img_no][0].squeeze()
    img_tgt = mnist_data[img_no][1]
    axs[a_row][a_col].imshow(img, cmap='gray')
    axs[a_row][a_col].set_xticks([])
    axs[a_row][a_col].set_yticks([])
    axs[a_row][a_col].set_title(img_tgt, fontsize=20)
plt.show()</pre>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="568" height="294" src="https://blog.finxter.com/wp-content/uploads/2022/11/image-257.png" alt="" class="wp-image-914867" srcset="https://blog.finxter.com/wp-content/uploads/2022/11/image-257.png 568w, https://blog.finxter.com/wp-content/uploads/2022/11/image-257-300x155.png 300w" sizes="auto, (max-width: 568px) 100vw, 568px" /></figure>
</div>


<p>So now we have a clear understanding of how our dataset is structured and what the data looks like. Much of this is explained in the dataset description page, but this kind of analysis is often very useful for getting a precise understanding of the dataset that might not be clear from the description.</p>



<h2 class="wp-block-heading">Step 4: Create Dataloaders</h2>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img loading="lazy" decoding="async" width="1024" height="768" src="https://blog.finxter.com/wp-content/uploads/2022/11/image-207-1024x768.png" alt="" class="wp-image-904045" srcset="https://blog.finxter.com/wp-content/uploads/2022/11/image-207-1024x768.png 1024w, https://blog.finxter.com/wp-content/uploads/2022/11/image-207-300x225.png 300w, https://blog.finxter.com/wp-content/uploads/2022/11/image-207-768x576.png 768w, https://blog.finxter.com/wp-content/uploads/2022/11/image-207.png 1251w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>
</div>


<p>Datasets make the data available for processing. </p>



<p>Typically, however, we will want to process the data in randomized mini-batches. </p>



<p>Data loaders make this easy. Data loaders are <a href="https://blog.finxter.com/iterators-iterables-and-itertools/" data-type="post" data-id="29507" target="_blank" rel="noreferrer noopener">iterables</a>, and you&#8217;ll see later that each step of iterating over a data loader returns one mini-batch from the dataset (randomized when shuffling is enabled) that can be processed through the neural network.</p>



<p>Let&#8217;s create some data loader objects from our datasets:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">batch_size = 100

mnist_train_dl = DataLoader(mnist_data,
                            batch_size=batch_size,
                            shuffle=True)

mnist_test_dl = DataLoader(mnist_test_data,
                           batch_size=batch_size,
                           shuffle=True)</pre>



<p>So we have created two data loaders, one for the training dataset, and one for the test dataset. </p>



<p>The <code>batch_size</code> parameter specifies the number of image/label pairs in the mini-batch that the data loader returns on each iteration. The <code>shuffle</code> parameter determines whether the data is reshuffled at the start of each pass through the dataset, so that the mini-batches are randomized.</p>
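
<p>To see what a data loader hands back, here is a short sketch that runs without downloading anything. It assumes a synthetic dataset of 1,000 blank images built with <code>TensorDataset</code>. The names <code>toy_data</code> and <code>toy_dl</code> are illustrative and not part of the tutorial code.</p>

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in for MNIST: 1000 blank 28x28 images with labels 0-9
images = torch.zeros(1000, 1, 28, 28)
labels = torch.arange(1000) % 10
toy_data = TensorDataset(images, labels)

toy_dl = DataLoader(toy_data, batch_size=100, shuffle=True)

print(len(toy_dl))   # number of mini-batches per pass: 1000 / 100 = 10

# Each iteration step yields one (images, labels) mini-batch
batch_images, batch_labels = next(iter(toy_dl))
print(batch_images.shape)   # torch.Size([100, 1, 28, 28])
print(batch_labels.shape)   # torch.Size([100])
```

<p>By the same arithmetic, the 60,000 MNIST training images with a batch size of 100 give 600 mini-batches per pass through the training data.</p>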



<h2 class="wp-block-heading">Step 5: Design and Create the Neural Network</h2>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="625" height="938" src="https://blog.finxter.com/wp-content/uploads/2022/11/image-208.png" alt="" class="wp-image-904047" srcset="https://blog.finxter.com/wp-content/uploads/2022/11/image-208.png 625w, https://blog.finxter.com/wp-content/uploads/2022/11/image-208-200x300.png 200w" sizes="auto, (max-width: 625px) 100vw, 625px" /></figure>
</div>


<h3 class="wp-block-heading">Check for GPU</h3>



<p>We are about to design and create the neural network, but first, let&#8217;s check if a GPU is available. </p>



<p>One advantage of PyTorch as a neural network framework is its GPU support. A GPU performs many calculations in parallel, which can greatly speed up computation. </p>



<p>Depending on the problem, processing can be an order of magnitude faster, or more.</p>



<p>Using a GPU with PyTorch is straightforward. First, use the function <code>torch.cuda.is_available()</code> to test if a GPU is available and properly configured for use by PyTorch (PyTorch uses the CUDA platform to access the GPU).</p>



<p>If a GPU is available, we will send the model and the data tensors to the GPU for processing.</p>



<p>The following code tests for the availability of a GPU, then sets the variable <code>device</code> to either <code>'cuda'</code> or <code>'cpu'</code> accordingly.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Using {device} device")
# Using cuda device</pre>
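
<p>As a small illustration of what &#8220;sending a tensor to the device&#8221; means, the following sketch moves a tensor with <code>.to(device)</code> and inspects where it ended up. The tensor <code>x</code> is a hypothetical example; on a machine without a GPU, everything simply stays on the CPU:</p>

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

x = torch.ones(2, 3)      # Tensors are created on the CPU by default
x_dev = x.to(device)      # Returns a tensor that lives on the chosen device

print(x.device, x_dev.device)
```

<p>The model and the tensors it processes must be on the same device, which is why both are moved with <code>.to(device)</code> throughout this article.</p>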



<h3 class="wp-block-heading">Create the Neural Network</h3>



<p>Now let&#8217;s design and create the neural network. We do this by creating a class, which we have chosen to call <code>NeuralNet</code>, as a subclass of PyTorch&#8217;s <code>nn.Module</code> class. </p>



<p>Here is the code to specify and then create our neural network:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">class NeuralNet(nn.Module):
  def __init__(self):
    super().__init__()          # Required to properly initialize class, ensures inheritance of the parent __init__() method
    self.flat_f = nn.Flatten()  # Creates function to smartly flatten tensor
    self.neur_net = nn.Sequential(
        nn.Linear(28*28, 512),
        nn.ReLU(),
        nn.Linear(512, 256),
        nn.ReLU(),
        nn.Linear(256,10)
    )

  def forward(self, x):
    x = self.flat_f(x)
    logits = self.neur_net(x)
    return logits

model = NeuralNet().to(device)</pre>



<p>There are a number of important details to review in this code.</p>



<p>First, our neural network definition class <em>must</em> have two methods included: an <code><a href="https://blog.finxter.com/python-init/" data-type="post" data-id="5133" target="_blank" rel="noreferrer noopener">__init__()</a></code> method, and a <code>forward()</code> method. </p>



<p>Classes in Python routinely include an <code>__init__()</code> method to initialize variables and other things in the object that is created. The class must also include a <code>forward()</code> method, which tells PyTorch how to process the data during the forward pass of the data. </p>



<p>Let&#8217;s go over each of these in more detail.</p>



<h3 class="wp-block-heading">Creating the Model: __init__() Method</h3>



<p>First, within the <code>__init__()</code> method note the <code>super().__init__()</code> command. When we create a subclass it inherits the parent class variables and methods. </p>



<p>However, when we write an <code>__init__()</code> method in the subclass, that overrides inheritance of the <code>__init__()</code> method from the parent class. </p>



<p>However, there are features in the parent class&#8217; <code>__init__()</code> that our class still needs. The <code>super().__init__()</code> command achieves this. In effect, it says <em>&#8220;run the parent class <code>__init__()</code> as part of our child class&#8217; <code>__init__()</code>&#8221;</em>. </p>



<p>To make a long story short, this is necessary to properly initialize our child class, by including some things needed from the parent <code>nn.Module</code> class.</p>



<p>Next, note that we create a flattening layer by instantiating <code>nn.Flatten()</code>. Even though each image is a 28&#215;28 pixel two-dimensional array, the processing still works if we convert it into a one-dimensional vector, placing the rows one after another to form a 28&#215;28 = 784 element vector (in fact, this is a common choice for fully connected networks).</p>



<p>The <code>torch.flatten()</code> function can do this. However, by default <code>torch.flatten()</code> (note the lower case <code>'f'</code>) flattens everything, turning a 100 image minibatch tensor of shape (100, 1, 28, 28) into a single vector of shape (78400,). </p>



<p>Instead, if we create a layer from the <code>nn.Flatten()</code> class (note the upper case <code>'F'</code>), by default it keeps the first (batch) dimension and flattens all the remaining dimensions, resulting in a tensor of shape (100, 784), representing a batch of 100 vectors of 784 elements each. </p>



<p class="has-global-color-8-background-color has-background"><strong>Note</strong>: double-check that your layer is flattening properly. If not, <code>nn.Flatten()</code> accepts the parameters <code>start_dim</code> and <code>end_dim</code>, which specify the range of dimensions to flatten. See the documentation for details.</p>
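
<p>The difference between the two kinds of flattening is easy to check on a dummy minibatch. This is a minimal sketch; the all-zeros tensor <code>batch</code> is a hypothetical placeholder with the same shape as a 100 image MNIST minibatch:</p>

```python
import torch
from torch import nn

# Placeholder minibatch: 100 images, 1 channel, 28x28 pixels
batch = torch.zeros(100, 1, 28, 28)

flat_all = torch.flatten(batch)        # Flattens every dimension, including the batch
flat_per_image = nn.Flatten()(batch)   # Keeps dimension 0, flattens the rest

print(flat_all.shape)        # torch.Size([78400])
print(flat_per_image.shape)  # torch.Size([100, 784])
```

<p>Only the second result keeps one row per image, which is the shape the first linear layer of the network expects.</p>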



<p>The last thing we do in the <code>__init__()</code> method is specify the neural network structure using the <code>nn.Sequential()</code> container. </p>



<p>Here we list the neural network layers in sequence from beginning to end. </p>



<p>First, we list an input layer of 28&#215;28=784 neurons, connecting through linear (weights * input + bias) connections to 512 neurons. These 512 neurons then pass data through a non-linear ReLU <a href="https://blog.finxter.com/bitcoin-price-forecast-with-lstm-based-architectures/" data-type="post" data-id="782261" target="_blank" rel="noreferrer noopener">activation function</a> layer. </p>



<p>Those signals then go through another linear layer connecting 512 neurons to 256 neurons. These signals then go through another ReLU activation function layer. Finally, the signals go through a final linear layer connecting the 256 neurons to 10 final output neurons.</p>



<p><code>'ReLU'</code> stands for <code>'Rectified Linear Unit'</code>. It is one of many non-linear activation functions which can be chosen. </p>



<p>It is defined as:</p>



<pre class="wp-block-preformatted"><code>f(x) = x, if x&gt;=0
else f(x) = 0</code></pre>



<p>Here is a graph of the ReLU function:</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="376" height="251" src="https://blog.finxter.com/wp-content/uploads/2022/11/image-197.png" alt="" class="wp-image-903892" srcset="https://blog.finxter.com/wp-content/uploads/2022/11/image-197.png 376w, https://blog.finxter.com/wp-content/uploads/2022/11/image-197-300x200.png 300w" sizes="auto, (max-width: 376px) 100vw, 376px" /></figure>
</div>
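
<p>The definition above translates almost word for word into plain Python. The function name <code>relu</code> and the sample values are illustrative only; in the network itself this job is done by <code>nn.ReLU()</code>:</p>

```python
def relu(x):
    """Return x for non-negative inputs and 0 otherwise."""
    return x if x >= 0 else 0.0

samples = [-2.0, -0.5, 0.0, 0.5, 2.0]
outputs = [relu(v) for v in samples]
print(outputs)  # [0.0, 0.0, 0.0, 0.5, 2.0]
```
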


<h3 class="wp-block-heading">Creating the Model: forward() Method</h3>



<p>The second required method for our class is the <code>forward()</code> method. </p>



<p>As mentioned, the <code>forward()</code> method tells <a href="https://blog.finxter.com/tensorflow-vs-pytorch/" data-type="post" data-id="692005" target="_blank" rel="noreferrer noopener">PyTorch</a> how to process the data during the forward pass. Here we first flatten our tensor using the flatten layer we defined previously under <code>__init__()</code>.</p>



<p>Then we pass the tensor through the <code>self.neur_net()</code> function we defined previously using the <code>nn.Sequential()</code> function. Finally, the results are returned.</p>



<p class="has-global-color-8-background-color has-background"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Important point</strong>: you will NOT call the <code>forward()</code> method directly in later code. PyTorch expects such a method, so it must be written, but it is PyTorch that runs it: calling the model object itself, as in <code>model(X)</code>, invokes <code>forward()</code> for you.</p>
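
<p>A quick way to convince yourself that calling the model object runs <code>forward()</code> is to count the calls. The class <code>TinyNet</code> and its <code>forward_calls</code> counter below are hypothetical, made up only for this demonstration:</p>

```python
import torch
from torch import nn

class TinyNet(nn.Module):              # Hypothetical toy model
    def __init__(self):
        super().__init__()
        self.layer = nn.Linear(4, 2)
        self.forward_calls = 0         # Counts how often forward() runs

    def forward(self, x):
        self.forward_calls += 1
        return self.layer(x)

tiny = TinyNet()
out = tiny(torch.zeros(3, 4))          # Note: tiny(...), not tiny.forward(...)
print(out.shape, tiny.forward_calls)
```

<p>Calling the object is preferred over calling <code>forward()</code> yourself because <code>nn.Module</code> also runs any registered hooks when the object is called.</p>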



<p>Finally, we create the neural network (here named <code>'model'</code>) by creating an instance of our <code>NeuralNet()</code> class. In addition, we move the model to the GPU (if available) by including the <code>.to(device)</code> method.</p>



<p>We can then <a href="https://blog.finxter.com/python-print/" data-type="post" data-id="20731" target="_blank" rel="noreferrer noopener">print</a> the model to examine the neural network object we have built:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">print(model)</pre>



<p>Output:</p>



<pre class="wp-block-preformatted"><code>NeuralNet(
  (flat_f): Flatten(start_dim=1, end_dim=-1)
  (neur_net): Sequential(
    (0): Linear(in_features=784, out_features=512, bias=True)
    (1): ReLU()
    (2): Linear(in_features=512, out_features=256, bias=True)
    (3): ReLU()
    (4): Linear(in_features=256, out_features=10, bias=True)
  )
)</code></pre>






<h2 class="wp-block-heading">Step 6: Choose Loss Function and Optimizer</h2>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img loading="lazy" decoding="async" width="1024" height="768" src="https://blog.finxter.com/wp-content/uploads/2022/11/image-209-1024x768.png" alt="" class="wp-image-904051" srcset="https://blog.finxter.com/wp-content/uploads/2022/11/image-209-1024x768.png 1024w, https://blog.finxter.com/wp-content/uploads/2022/11/image-209-300x225.png 300w, https://blog.finxter.com/wp-content/uploads/2022/11/image-209-768x576.png 768w, https://blog.finxter.com/wp-content/uploads/2022/11/image-209.png 1250w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>
</div>


<p>Next, we&#8217;ll need to specify our loss function and our optimizer algorithm.</p>



<h3 class="wp-block-heading">Choosing Cross Entropy Loss</h3>



<p>Recall that the loss function measures how far the model&#8217;s guess is from the correct answer for a given input. Adjusting weights and biases to minimize loss is how neural networks learn (see the Finxter article <a rel="noreferrer noopener" href="https://blog.finxter.com/how-neural-networks-learn/" target="_blank">&#8220;How Neural Networks Learn&#8221;</a> for details).</p>



<p>There are multiple choices of loss functions available, and learning about these various functions is something you will want to do, because which loss choice is most suitable depends on the particular kind of problem you are solving.</p>



<p>In this case, we are sorting images into multiple categories. </p>



<p>One of the most suitable loss choices for this case is <em>cross-entropy loss</em>. Cross entropy is an idea taken from information theory: it measures the average number of bits needed to send a message when using a code optimized for an estimated distribution rather than the true one.</p>



<p>This is beyond the scope of this exercise, but we can understand its usefulness to our situation if we examine the calculation involved:</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="274" height="82" src="https://blog.finxter.com/wp-content/uploads/2022/11/image-258.png" alt="" class="wp-image-914876"/></figure>
</div>


<p>That is, for each category multiply the true probability <em>t</em> by the log of the model&#8217;s estimated probability <em>p</em>, add them all up, and take the negative of the sum. </p>



<p>Of course, <em>t</em> is zero for each incorrect category, and 1 for the correct category. </p>



<p>Consequently, for any given image, just the correct category is selected to contribute to the loss calculation, and that loss is the negative of the log of the probability estimate.</p>



<p>Recall this is what the <code>log()</code> function looks like:</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="385" height="261" src="https://blog.finxter.com/wp-content/uploads/2022/11/image-199.png" alt="" class="wp-image-903924" srcset="https://blog.finxter.com/wp-content/uploads/2022/11/image-199.png 385w, https://blog.finxter.com/wp-content/uploads/2022/11/image-199-300x203.png 300w" sizes="auto, (max-width: 385px) 100vw, 385px" /></figure>
</div>


<p>Since the network provides a probability estimate, we are only interested in the interval <code>(0,1]</code>. Here is what the negative of the <code>log()</code> looks like over that interval:</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="396" height="251" src="https://blog.finxter.com/wp-content/uploads/2022/11/image-200.png" alt="" class="wp-image-903925" srcset="https://blog.finxter.com/wp-content/uploads/2022/11/image-200.png 396w, https://blog.finxter.com/wp-content/uploads/2022/11/image-200-300x190.png 300w" sizes="auto, (max-width: 396px) 100vw, 396px" /></figure>
</div>


<p>So the loss is very large when the network gives a low probability estimate (near zero) for the correct category, and the loss is lowest (near zero) when the network gives a high probability estimate (near 1.0) for the correct category.</p>
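
<p>The calculation can be sketched in a few lines of plain Python. The helper <code>cross_entropy</code> and the three-category probabilities are hypothetical illustrations, not part of the network code:</p>

```python
import math

def cross_entropy(true_probs, est_probs):
    """Negative sum of t * log(p); categories with t == 0 contribute nothing."""
    return -sum(t * math.log(p) for t, p in zip(true_probs, est_probs) if t > 0)

true_probs = [0.0, 1.0, 0.0]          # The second category is the correct one

confident = [0.05, 0.90, 0.05]        # High estimate for the correct category
mistaken = [0.60, 0.10, 0.30]         # Low estimate for the correct category

loss_good = cross_entropy(true_probs, confident)
loss_bad = cross_entropy(true_probs, mistaken)
print(loss_good, loss_bad)
```

<p>Because only the correct category has a non-zero <em>t</em>, each loss reduces to the negative log of a single probability: a small value for the confident estimate and a much larger one for the mistaken estimate.</p>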



<p>Here is the code specifying cross entropy loss as the loss function:</p>






<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">loss_fn = nn.CrossEntropyLoss()</pre>
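
<p>One detail worth knowing: <code>nn.CrossEntropyLoss</code> expects the raw output scores (logits) of the network, and applies a log-softmax internally to turn them into log-probabilities before taking the negative log of the correct category. The sketch below checks this on hypothetical logits for two items and three categories by comparing the built-in loss with a manual calculation:</p>

```python
import torch
from torch import nn

# Hypothetical raw scores for 2 items and 3 categories, plus the correct labels
logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 0.2, 3.0]])
targets = torch.tensor([0, 2])

loss_fn = nn.CrossEntropyLoss()
builtin = loss_fn(logits, targets)

# Manual version: log-softmax, pick the correct category, negate, average
log_probs = torch.log_softmax(logits, dim=1)
manual = -log_probs[torch.arange(len(targets)), targets].mean()

print(builtin.item(), manual.item())
```

<p>This is why the network defined earlier ends with a plain linear layer rather than a softmax layer.</p>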



<h3 class="wp-block-heading">Choosing Optimizer Algorithm</h3>



<p>We also need to choose the optimizer algorithm. This is the method used to minimize the loss through training. Multiple different optimizers may be chosen, and you will want to learn about the various optimizers available. </p>



<p>Most of the commonly used ones are variations on gradient descent. </p>



<p>For example, some include decay of the learning rate; others include momentum, which can help the optimizer move past shallow local minima.</p>



<p>In our case, we will choose plain-old vanilla stochastic gradient descent. Here is the code specifying the optimizer and its learning rate:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">learning_rate = 1e-3
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)</pre>
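
<p>The update rule behind plain gradient descent fits in a few lines. This sketch minimizes a hypothetical one-parameter loss, (w - 3)<sup>2</sup>, with a hand-written gradient; it illustrates the idea only and is not how <code>torch.optim.SGD</code> is implemented internally:</p>

```python
def grad(w):
    """Gradient of the toy loss (w - 3)**2 with respect to w."""
    return 2.0 * (w - 3.0)

learning_rate = 0.1
w = 0.0                                  # Arbitrary starting value

for step in range(50):
    w = w - learning_rate * grad(w)      # Step against the gradient

print(w)
```

<p>Each step moves <code>w</code> a fraction of the way toward the minimum at 3, with the learning rate controlling the size of that fraction.</p>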



<h2 class="wp-block-heading">Step 7: Specify Training and Testing Functions</h2>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img loading="lazy" decoding="async" width="1024" height="682" src="https://blog.finxter.com/wp-content/uploads/2022/11/image-210-1024x682.png" alt="" class="wp-image-904055" srcset="https://blog.finxter.com/wp-content/uploads/2022/11/image-210-1024x682.png 1024w, https://blog.finxter.com/wp-content/uploads/2022/11/image-210-300x200.png 300w, https://blog.finxter.com/wp-content/uploads/2022/11/image-210-768x512.png 768w, https://blog.finxter.com/wp-content/uploads/2022/11/image-210.png 1255w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>
</div>


<p>Now we define functions for training and testing the neural network.</p>



<h3 class="wp-block-heading">Training Function</h3>



<p>Here is the code specifying the training function:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def train_nn(dataloader, model, loss_fn, optimizer):
  size = len(dataloader.dataset)
  for batch, (X, y) in enumerate(dataloader):
    X, y = X.to(device), y.to(device)
        
    # For each image in batch X, compute prediction
    pred = model(X)
    # Compute average loss for the set of images in batch
    loss = loss_fn(pred, y)

    # Backpropagation
    optimizer.zero_grad()   # Zero gradients
    loss.backward()         # Computes gradients
    optimizer.step()        # Update weights, biases according to gradients, factored by learning rate

    if batch % 100 == 0:      # Report progress every 100 batches
      loss, current = loss.item(), batch * len(X)
      print(f"loss: {loss:>7f}  [{current:>5d}/{size:>5d}]")</pre>



<p>We pass into the function the dataloader, model, loss function, and optimizer objects.</p>



<p>The function then loops over minibatches from the dataloader.</p>



<p>For each loop, a minibatch of the input images X and the labels y is retrieved and then moved to the GPU (if available). </p>



<p>Then the neural network model calculates predictions from the input images X. These predictions and the correct labels y are used to calculate the loss (note this loss is a single number that is the average loss for the minibatch).</p>



<p>Once the loss is calculated, the function can adjust weights and biases (backpropagate) in three code steps. </p>



<p>First, gradient attributes are zeroed out using <code>optimizer.zero_grad()</code> (PyTorch defaults to accumulating gradient calculations, so they need to be zeroed out on each iteration of the loop, or else they&#8217;ll keep accumulating data). </p>



<p>Then the gradients are calculated using <code>loss.backward()</code>. Finally, weights and biases are updated according to the gradients using <code>optimizer.step()</code>.</p>
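
<p>The accumulation behavior is easy to observe on a single tensor. In this sketch, the scalar <code>w</code> is a hypothetical stand-in for a model weight, and <code>w.grad.zero_()</code> plays the role that <code>optimizer.zero_grad()</code> plays for all model parameters:</p>

```python
import torch

w = torch.tensor(2.0, requires_grad=True)

(3 * w).backward()               # d(3w)/dw = 3
first = w.grad.item()

(3 * w).backward()               # Without zeroing, the new gradient is added
accumulated = w.grad.item()

w.grad.zero_()                   # Reset the stored gradient
(3 * w).backward()
after_zero = w.grad.item()

print(first, accumulated, after_zero)  # 3.0 6.0 3.0
```
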



<p>Finally, a small section is included to report progress every 100 batches. This prints out the current loss, and how many images of the total images have been completed.</p>



<h3 class="wp-block-heading">Testing Function</h3>



<p>Here is the code specifying the testing function:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def test_loop(dataloader, model, loss_fn):      # After each epoch, test training results (report categorizing accuracy, loss)
    size = len(dataloader.dataset)              # Number of image/label pairs in dataset
    num_batches = len(dataloader)
    test_loss, correct = 0, 0                   # Initialize variables tracking loss and accuracy during test loop

    with torch.no_grad():                       # Disable gradient tracking - reduces resource use and speeds up processing
        for X, y in dataloader:
            X, y = X.to(device), y.to(device)

            pred = model(X)                     # Get predictions from the neural network based on input minibatch X
            test_loss += loss_fn(pred, y).item()  # Accumulate loss values during loop through dataset
            correct += (pred.argmax(1) == y).type(torch.float).sum().item()    # Accumulate correct predictions during loop through dataset

    test_loss /= num_batches                    # Calculate average loss
    correct /= size                             # Calculate accuracy rate
    print(f"Test Error: \n Accuracy: {(100*correct):>0.1f}%, Avg loss: {test_loss:>8f} \n")   # Report test results</pre>



<p>This function tests the accuracy of the network using the test data. </p>



<p>First, we pass in the testing dataloader, the model, and the loss function (for testing loss). Then the function initializes several variables, notably <code>test_loss</code> and <code>correct</code>, for accumulating test results during the test loop.</p>



<p>The function does the next few steps within a <code>with torch.no_grad():</code> block. </p>



<p>Here is why: PyTorch stores calculations from the forward pass for later use during the backpropagation gradient calculations. </p>



<p>The <code>torch.no_grad()</code> context manager turns that off inside the <code>with</code> block, since there will be only a forward pass during testing. This saves resources and speeds up processing. You will want to do the same thing once you have a trained network that is used for classifying in production. </p>



<p>After leaving the <code>with</code> block, the calculation-storing feature automatically resumes.</p>



<p class="has-global-color-8-background-color has-background"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Note</strong>: be aware that storing calculations is turned on here because the weights and biases of the <code>nn.Linear</code> layers are created with <code>requires_grad=True</code>. Ordinary PyTorch tensors default to <code>requires_grad=False</code>.</p>
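
<p>The effect of <code>torch.no_grad()</code> can be checked directly through the <code>requires_grad</code> attribute. The single linear layer and the random input in this sketch are hypothetical stand-ins for the full model and a minibatch:</p>

```python
import torch
from torch import nn

layer = nn.Linear(4, 2)          # Its weight and bias track gradients by default
x = torch.rand(3, 4)             # An ordinary tensor does not

tracked = layer(x)               # Normal forward pass: calculations are stored

with torch.no_grad():
    untracked = layer(x)         # Same computation, but nothing is stored

print(layer.weight.requires_grad, x.requires_grad)     # True False
print(tracked.requires_grad, untracked.requires_grad)  # True False
```
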



<p>Then the function uses a for loop to iterate through the minibatches of the test dataloader. For each iteration, the neural network model computes predictions from the minibatch of images. The loss is calculated for the minibatch, which is then accumulated in <code>test_loss</code>.</p>



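<p>Then the number of correct predictions for the minibatch is found as follows: first note that <code>pred</code> is a set of 10-element vectors, one per image, with each element a raw score (logit) for the corresponding digit. The larger the score, the more probable the model considers that digit (the scores become probability estimates only after a softmax, which <code>nn.CrossEntropyLoss</code> applies internally). </p>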



<p>The <code>.argmax(1)</code> method returns the index of the largest estimate (the number 1 in the <code>argmax()</code> argument indicates which dimension to use for the operation). This list (tensor) of indices is compared to the list (tensor) of correct labels in <code>y</code>. </p>



<p>This results in a list (tensor) containing <code>True</code> where there is a match, and <code>False</code> otherwise. The <code>.type(torch.float)</code> method converts these into floating point 1&#8217;s and 0&#8217;s. </p>



<p>The <code>sum()</code> method adds all the elements together. Then finally, the <code>.item()</code> method converts the totaled one-element tensor into a raw number (scalar). </p>



<p>We now have the total number of correct predictions for that batch, which is added to the <code>correct</code> variable that accumulates the total number of correct predictions as the for loop iterates through the dataloader.</p>
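
<p>Here is the same chain of operations applied step by step to a tiny hypothetical batch of four predictions over three categories (instead of ten), so each intermediate result is small enough to read:</p>

```python
import torch

# Hypothetical scores for 4 items and 3 categories, plus the correct labels
pred = torch.tensor([[0.1, 2.0, -1.0],
                     [1.5, 0.2, 0.3],
                     [0.0, 0.1, 0.9],
                     [2.2, 0.4, 1.1]])
y = torch.tensor([1, 0, 1, 0])

predicted = pred.argmax(1)                    # Index of the largest score per row
matches = predicted == y                      # True where prediction equals label
n_correct = matches.type(torch.float).sum().item()

print(predicted.tolist())   # [1, 0, 2, 0]
print(matches.tolist())     # [True, True, False, True]
print(n_correct)            # 3.0
```

<p>Dividing the accumulated count by the dataset size, as the testing function does at the end, turns it into an accuracy rate.</p>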



<h3 class="wp-block-heading">Train and Test the Network</h3>



<p>Now that the training and testing functions are written, we can write a small main program loop to train and test the network. We specify how many epochs we wish to run, then we loop through those epochs, training and testing the network for each one.</p>



<p>Here is the code:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># The main program!

epochs = 5
for t in range(epochs):
    print(f"Epoch {t+1}\n-------------------------------")
    train_nn(mnist_train_dl, model, loss_fn, optimizer)
    test_loop(mnist_test_dl, model, loss_fn)
print("Done!")</pre>



<p>Output:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">Epoch 1
-------------------------------
loss: 2.102096  [    0/60000]
loss: 2.119211  [10000/60000]
loss: 2.068424  [20000/60000]
loss: 2.056982  [30000/60000]
loss: 2.028877  [40000/60000]
loss: 1.995214  [50000/60000]
Test Error: 
 Accuracy: 65.9%, Avg loss: 2.000194 

Epoch 2
-------------------------------
loss: 2.018245  [    0/60000]
loss: 1.996478  [10000/60000]
loss: 1.969913  [20000/60000]
loss: 1.999372  [30000/60000]
loss: 1.944238  [40000/60000]
loss: 1.863184  [50000/60000]
Test Error: 
 Accuracy: 67.8%, Avg loss: 1.866808 

Epoch 3
-------------------------------
loss: 1.921477  [    0/60000]
loss: 1.891367  [10000/60000]
loss: 1.840778  [20000/60000]
loss: 1.751534  [30000/60000]
loss: 1.718531  [40000/60000]
loss: 1.800236  [50000/60000]
Test Error: 
 Accuracy: 69.5%, Avg loss: 1.695623 

Epoch 4
-------------------------------
loss: 1.692079  [    0/60000]
loss: 1.752511  [10000/60000]
loss: 1.600570  [20000/60000]
loss: 1.582768  [30000/60000]
loss: 1.532521  [40000/60000]
loss: 1.569566  [50000/60000]
Test Error: 
 Accuracy: 71.9%, Avg loss: 1.498120 

Epoch 5
-------------------------------
loss: 1.507337  [    0/60000]
loss: 1.515740  [10000/60000]
loss: 1.437465  [20000/60000]
loss: 1.424620  [30000/60000]
loss: 1.409456  [40000/60000]
loss: 1.385026  [50000/60000]
Test Error: 
 Accuracy: 74.6%, Avg loss: 1.300192 

Done!
</pre>



<p>After just 5 epochs, the accuracy isn&#8217;t very good yet, but we can see that things are moving in the right direction. </p>



<p>Obviously, if we wanted to get good performance we would need to train for more epochs. Figuring out how much to train (being careful not to overfit!) is something a neural network engineer has to work out.</p>



<h2 class="wp-block-heading">Reviewing the Big Picture</h2>



<p>It may seem like we have gone over a lot, and we have, but if you step back and look at the big picture there isn&#8217;t a lot here. </p>



<p>It may seem like a lot because we have reviewed everything in detail to make sure we convey full understanding. </p>



<p>However, to gain some perspective, let&#8217;s show all the essential code, without all the extra description and explanation (note, we&#8217;re also skipping the code here used to review the dataset):</p>



<h3 class="wp-block-heading">Import Necessary Libraries</h3>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import torch
from torch import nn
from torchvision import datasets
from torch.utils.data import DataLoader
from torchvision.transforms import ToTensor</pre>



<h3 class="wp-block-heading">Acquire the Data</h3>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Download MNIST data, put it in pytorch dataset
mnist_data = datasets.MNIST(
    root='mnist_nn',
    train=True,
    download=True,
    transform=ToTensor()
)
</pre>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">mnist_test_data = datasets.MNIST(
    root='mnist_nn',
    train=False,
    download=True,
    transform=ToTensor()
)</pre>



<h3 class="wp-block-heading">Create Dataloaders</h3>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">batch_size = 100
mnist_train_dl = DataLoader(mnist_data,
                      batch_size=batch_size,
                      shuffle=True)

mnist_test_dl = DataLoader(mnist_test_data,
                           batch_size=batch_size,
                           shuffle=True)</pre>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h3 class="wp-block-heading">Check for GPU</h3>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Using {device} device")
# Using cuda device</pre>



<h3 class="wp-block-heading">Design and Create the Neural Network</h3>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">class NeuralNet(nn.Module):
  def __init__(self):
    super().__init__()          # Required to properly initialize class, ensures inheritance of the parent __init__() method
    self.flat_f = nn.Flatten()  # Creates function to smartly flatten tensor
    self.neur_net = nn.Sequential(
        nn.Linear(28*28, 512),
        nn.ReLU(),
        nn.Linear(512, 256),
        nn.ReLU(),
        nn.Linear(256,10)
    )

  def forward(self, x):
    x = self.flat_f(x)
    logits = self.neur_net(x)
    return logits

model = NeuralNet().to(device)</pre>



<h3 class="wp-block-heading">Choose Loss Function and Optimizer</h3>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">loss_fn = nn.CrossEntropyLoss()
learning_rate = 1e-3
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)</pre>



<h3 class="wp-block-heading">Specify Training and Testing Functions</h3>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def train_nn(dataloader, model, loss_fn, optimizer):
  size = len(dataloader.dataset)
  for batch, (X, y) in enumerate(dataloader):
    X, y = X.to(device), y.to(device)
        
    # For each image in batch X, compute prediction
    pred = model(X)
    # Compute average loss for the set of images in batch
    loss = loss_fn(pred, y)

    # Backpropagation
    optimizer.zero_grad()   # Zero gradients
    loss.backward()         # Computes gradients
    optimizer.step()        # Update weights, biases according to gradients, factored by learning rate

    if batch % 100 == 0:      # Report progress every 100 batches
      loss, current = loss.item(), batch * len(X)
      print(f"loss: {loss:>7f}  [{current:>5d}/{size:>5d}]")
</pre>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def test_loop(dataloader, model, loss_fn):      # After each epoch, test training results (report categorizing accuracy, loss)
    size = len(dataloader.dataset)              # Number of image/label pairs in dataset
    num_batches = len(dataloader)
    test_loss, correct = 0, 0                   # Initialize variables tracking loss and accuracy during test loop

    with torch.no_grad():                       # Disable gradient tracking - reduces resource use and speeds up processing
        for X, y in dataloader:
            X, y = X.to(device), y.to(device)

            pred = model(X)                     # Get predictions from the neural network based on input minibatch X
            test_loss += loss_fn(pred, y).item()  # Accumulate loss values during loop through dataset
            correct += (pred.argmax(1) == y).type(torch.float).sum().item()    # Accumulate correct predictions during loop through dataset

    test_loss /= num_batches                    # Calculate average loss
    correct /= size                             # Calculate accuracy rate
    print(f"Test Error: \n Accuracy: {(100*correct):>0.1f}%, Avg loss: {test_loss:>8f} \n")   # Report test results</pre>
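


<p>The accuracy line in the test function packs several steps into one expression. Here is a small sketch that unpacks it, using hypothetical logits for four images and three classes rather than real MNIST predictions:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import torch

# Hypothetical logits for a batch of 4 images and 3 classes
pred = torch.tensor([[2.0, 0.1, 0.3],
                     [0.2, 1.5, 0.1],
                     [0.3, 0.2, 0.9],
                     [1.1, 0.4, 0.2]])
y = torch.tensor([0, 1, 1, 0])          # true class labels

predicted_classes = pred.argmax(1)      # index of the largest logit in each row
matches = (predicted_classes == y)      # boolean tensor, True where the prediction is right
correct = matches.type(torch.float).sum().item()

print(predicted_classes)                # tensor([0, 1, 2, 0])
print(correct / len(y))                 # 0.75</pre>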



<h3 class="wp-block-heading">Train and Test the Network</h3>



<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># The main program!

epochs = 5
for t in range(epochs):
    print(f"Epoch {t+1}\n-------------------------------")
    train_nn(mnist_train_dl, model, loss_fn, optimizer)
    test_loop(mnist_test_dl, model, loss_fn)
print("Done!")</pre>



<p>We have written just a few dozen lines of code, comparable in size to a program a hobbyist programmer might write.</p>



<p>Yet we&#8217;ve built a working neural network that recognizes handwritten digits and converts them to numbers a computer can work with. That&#8217;s pretty amazing!</p>



<p>Of course, this is all possible thanks to the efforts of the many engineers who wrote the many more lines of code within PyTorch. Thank you to all of you who have contributed to PyTorch! </p>



<p>This is another example of achieving great things by standing on the shoulders of giants!</p>



<h2 class="wp-block-heading">Saving and Reloading the Network</h2>



<p>We have built, trained, and tested a neural network, and that&#8217;s great. But really, the point of training a neural network is to put it to use. To support that, we need to be able to save and reload the network for later use.</p>



<p>Use the following code to save the weights and biases of your neural network (<strong>note</strong>: the common convention is to save these files with extension <code>.pt</code> or <code>.pth</code>):</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">torch.save(network_name.state_dict(), 'filename.pth')</pre>



<p>Since we named our network <code>model</code>, we would save it as follows:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">torch.save(model.state_dict(), 'model_weights.pth')</pre>



<p>To reload, first create an instance of your neural network (make sure you have access to the class/neural network you originally specified). In our example:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">user_model = NeuralNet().to(device)</pre>



<p>Then load the new instance with your saved weights and biases:</p>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">user_model.load_state_dict(torch.load('model_weights.pth'))
# &lt;All keys matched successfully></pre>
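


<p>To check that a save and reload really preserves the weights, we can run a round trip on a tiny stand-in network. This is a sketch under stated assumptions: the <code>nn.Linear</code> layer and the file name <code>tiny_weights.pth</code> are hypothetical and are used only to keep the example self-contained:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import os
import tempfile
import torch
from torch import nn

# Tiny stand-in network (hypothetical; much smaller than the article's NeuralNet)
original = nn.Linear(4, 2)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "tiny_weights.pth")
    torch.save(original.state_dict(), path)

    reloaded = nn.Linear(4, 2)    # fresh instance with different random weights
    # map_location lets a file saved on a GPU machine load on a CPU-only machine
    state = torch.load(path, map_location="cpu")
    result = reloaded.load_state_dict(state)

print(result)                     # reports that all keys matched
same = all(torch.equal(a, b) for a, b in
           zip(original.state_dict().values(), reloaded.state_dict().values()))
print(same)                       # True</pre>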



<p>Some modules behave differently during training than they do when the network is in use.</p>



<p>Specifically, when in training mode, some of them implement various <em>regularization methods</em> which are used to resist the onset of overfitting. </p>



<p>These methods may include some randomness and can cause the network to give inconsistent results. To avoid this, make sure you are in evaluation mode and not training mode:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">user_model.eval()</pre>



<p>Output:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">NeuralNet(
  (flat_f): Flatten(start_dim=1, end_dim=-1)
  (neur_net): Sequential(
    (0): Linear(in_features=784, out_features=512, bias=True)
    (1): ReLU()
    (2): Linear(in_features=512, out_features=256, bias=True)
    (3): ReLU()
    (4): Linear(in_features=256, out_features=10, bias=True)
  )
)</pre>



<p>As you can see, <code>eval()</code> returns the network itself, so in a notebook this command conveniently reports the neural network structure.</p>
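


<p>Our network contains only <code>Flatten</code>, <code>Linear</code>, and <code>ReLU</code> modules, so it should behave the same in either mode. To see the difference, here is a sketch with a hypothetical network that contains a dropout layer, one common regularization method:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import torch
from torch import nn

# Hypothetical network containing a dropout layer (the article's NeuralNet has none)
net = nn.Sequential(nn.Dropout(p=0.5))
x = torch.ones(1, 8)

net.train()                 # training mode: dropout randomly zeroes elements
train_out = net(x)          # surviving elements are scaled by 1 / (1 - p) = 2

net.eval()                  # evaluation mode: dropout passes data through unchanged
eval_out = net(x)

print(train_out)            # a random mix of 0. and 2.
print(eval_out)             # all ones
print(net.training)         # False</pre>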



<p>Let&#8217;s make sure our reloaded network works.</p>



<p>It would be best to test with some new handwritten digits, but for the sake of convenience let&#8217;s just test it with the first ten test images (especially since the network was not trained very heavily).</p>



<p>Let&#8217;s look at these first ten images in the test dataset:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">fig, axs = plt.subplots(2, 5, figsize=(8, 5))
for a_row in range(2):
  for a_col in range(5):
    img_no = a_row*5 + a_col
    img = mnist_test_data[img_no][0].squeeze()
    img_tgt = mnist_test_data[img_no][1]
    axs[a_row][a_col].imshow(img, cmap='gray')
    axs[a_row][a_col].set_xticks([])
    axs[a_row][a_col].set_yticks([])
    axs[a_row][a_col].set_title(img_tgt, fontsize=20)
plt.show()</pre>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="616" height="342" src="https://blog.finxter.com/wp-content/uploads/2022/11/image-259.png" alt="" class="wp-image-914879" srcset="https://blog.finxter.com/wp-content/uploads/2022/11/image-259.png 616w, https://blog.finxter.com/wp-content/uploads/2022/11/image-259-300x167.png 300w" sizes="auto, (max-width: 616px) 100vw, 616px" /></figure>
</div>


<p>Now let&#8217;s see if the network detects these images properly:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def eval_image(model, imgno):
  testimg = mnist_test_data[imgno][0]       # get image number 'imgno' from the test dataset
  testimg = testimg.to(device)              # move image data to the same device as the model
  with torch.no_grad():                     # no gradients are needed for inference
    logits = model(testimg)                 # run image through network
  return logits.argmax().item()             # index of the largest logit is the predicted digit

for img_no in range(10):
  img_val = eval_image(user_model, img_no)  # use the reloaded network
  print(img_val)
</pre>



<p>Output:</p>



<pre class="wp-block-preformatted"><code>
7
2
1
0
4
1
7
9
6
7</code>
</pre>



<p>The results are not perfect, but for an incompletely trained network that&#8217;s not bad! The few failures are plausible given the incomplete training. Our network works with the saved and reloaded weights and biases!</p>



<h2 class="wp-block-heading">Conclusion</h2>



<p>We hope you have found this article educational, and we hope it inspires you to go and build your own working neural networks using PyTorch!</p>
<p>The post <a href="https://blog.finxter.com/using-pytorch-to-build-a-working-neural-network/">Using PyTorch to Build a Working Neural Network</a> appeared first on <a href="https://blog.finxter.com">Be on the Right Side of Change</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Tensors: The Vocabulary of Neural Networks</title>
		<link>https://blog.finxter.com/tensors-the-vocabulary-of-neural-networks/</link>
		
		<dc:creator><![CDATA[Aaron Glatzer]]></dc:creator>
		<pubDate>Fri, 26 Aug 2022 13:20:25 +0000</pubDate>
				<category><![CDATA[Algorithms]]></category>
		<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[Computer Science]]></category>
		<category><![CDATA[Deep Learning]]></category>
		<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[Math]]></category>
		<guid isPermaLink="false">https://blog.finxter.com/?p=616223</guid>

					<description><![CDATA[<p>In this article, we will introduce one of the core elements describing the mathematics of neural networks: tensors. 🧬 Although typically, you won&#8217;t work directly with tensors (usually they operate under the hood), it is important to understand what&#8217;s going on behind the scenes. In addition, you may often wish to examine tensors so that ... <a title="Tensors: The Vocabulary of Neural Networks" class="read-more" href="https://blog.finxter.com/tensors-the-vocabulary-of-neural-networks/" aria-label="Read more about Tensors: The Vocabulary of Neural Networks">Read more</a></p>
<p>The post <a href="https://blog.finxter.com/tensors-the-vocabulary-of-neural-networks/">Tensors: The Vocabulary of Neural Networks</a> appeared first on <a href="https://blog.finxter.com">Be on the Right Side of Change</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p>In this article, we will introduce one of the core elements describing the mathematics of neural networks: <strong>tensors</strong>. <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f9ec.png" alt="🧬" class="wp-smiley" style="height: 1em; max-height: 1em;" /></p>



<figure class="wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio"><div class="wp-block-embed__wrapper">
<iframe loading="lazy" title="Tensors: The Vocabulary of Neural Networks" width="937" height="527" src="https://www.youtube.com/embed/VybtsVcIoSg?feature=oembed" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe>
</div></figure>



<p>Although typically, you won&#8217;t work directly with tensors (usually they operate under the hood), it is important to understand what&#8217;s going on behind the scenes. In addition, you may often wish to examine tensors so that you can look directly at the data, or look at the arrays of weights and biases, so it&#8217;s important to be able to work with tensors.</p>



<p class="has-global-color-8-background-color has-background"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Note</strong>: This article assumes you are familiar with how neural networks work. To review those basics, see the article <a href="https://blog.finxter.com/the-magic-of-neural-networks-how-they-work/" target="_blank" rel="noreferrer noopener">The Magic of Neural Networks: History and Concepts</a>. It also assumes you have some familiarity with <a href="https://blog.finxter.com/an-introduction-to-python-classes-inheritance-encapsulation-and-polymorphism/" data-type="post" data-id="30977" target="_blank" rel="noreferrer noopener">Python&#8217;s object-oriented programming</a>.</p>



<p>Theoretically, we could use pure Python to implement neural networks. </p>



<ul class="wp-block-list">
<li>We could use <a rel="noreferrer noopener" href="https://blog.finxter.com/python-lists/" data-type="post" data-id="7332" target="_blank">Python lists</a> to represent <strong>data</strong> in the network; </li>



<li>We could use other lists representing <strong>weights and biases</strong> in the network; and </li>



<li>We could use <a rel="noreferrer noopener" href="https://blog.finxter.com/how-to-write-a-nested-for-loop-in-one-line-python/" data-type="post" data-id="11859" target="_blank">nested <code>for</code> loops</a> to perform the operations of multiplying the inputs by the connection weights.</li>
</ul>



<p>There are a few issues with this, however: Python, especially the list data type, performs rather slowly. Also, the code would not be very readable with nested <code>for</code> loops.</p>
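


<p>For comparison, here is a rough sketch of what the pure Python approach might look like for one small layer. The input and weight values are hypothetical, chosen only for illustration:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group=""># Hypothetical values: 2 input neurons feeding 3 output neurons
inputs = [1.0, 2.0]
weights = [[0.1, 0.2],     # weights into output neuron 0
           [0.3, 0.4],     # weights into output neuron 1
           [0.5, 0.6]]     # weights into output neuron 2

outputs = []
for row in weights:                    # one pass per output neuron
    total = 0.0
    for w, x in zip(row, inputs):      # weighted sum over the inputs
        total += w * x
    outputs.append(total)

print(outputs)                         # three weighted sums, one per output neuron</pre>



<p>Even for this tiny layer the bookkeeping obscures the idea, and a real network would need another loop level for each batch of data.</p>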



<p>Instead, libraries that implement <a rel="noreferrer noopener" href="https://blog.finxter.com/how-neural-networks-learn/" data-type="post" data-id="568016" target="_blank">neural networks</a>, such as <a rel="noreferrer noopener" href="https://blog.finxter.com/pytorch-developer-income-and-opportunity/" data-type="post" data-id="255891" target="_blank">PyTorch</a>, use tensors, and tensor operations run much more quickly than pure Python. Also, as you will see, tensors allow much more readable descriptions of networks and their data.</p>



<h2 class="wp-block-heading" id="Tensors">Tensors</h2>



<p class="has-global-color-8-background-color has-background"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/2139.png" alt="ℹ" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Tensors</strong> are essentially arrays of values. Since neural networks are essentially arrays of neurons, tensors are a natural fit for describing them. They can be used for describing the data, describing the network connection weights, and other things.</p>



<p>A one-dimensional tensor is known as a <strong>vector</strong>. Here is an example:</p>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img loading="lazy" decoding="async" width="1024" height="100" src="https://blog.finxter.com/wp-content/uploads/2022/08/image-75-1024x100.png" alt="" class="wp-image-616229" srcset="https://blog.finxter.com/wp-content/uploads/2022/08/image-75-1024x100.png 1024w, https://blog.finxter.com/wp-content/uploads/2022/08/image-75-300x29.png 300w, https://blog.finxter.com/wp-content/uploads/2022/08/image-75-768x75.png 768w, https://blog.finxter.com/wp-content/uploads/2022/08/image-75.png 1319w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>
</div>


<p>Vectors can also be written horizontally. Here&#8217;s the same vector written horizontally:</p>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img loading="lazy" decoding="async" width="1024" height="49" src="https://blog.finxter.com/wp-content/uploads/2022/08/image-76-1024x49.png" alt="" class="wp-image-616230" srcset="https://blog.finxter.com/wp-content/uploads/2022/08/image-76-1024x49.png 1024w, https://blog.finxter.com/wp-content/uploads/2022/08/image-76-300x14.png 300w, https://blog.finxter.com/wp-content/uploads/2022/08/image-76-768x36.png 768w, https://blog.finxter.com/wp-content/uploads/2022/08/image-76.png 1329w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>
</div>


<p>Switching a vector from vertical to horizontal, or vice versa, is called <strong>transposing</strong>, and is sometimes needed depending on the math specifics. We will not go into detail on this in this article (see <a href="https://blog.finxter.com/pandas-dataframe-t-and-transpose-method/" data-type="post" data-id="343967" target="_blank" rel="noreferrer noopener">here for more</a>).</p>



<p>Vectors are typically used to represent data in the network. For example, each individual element in a vector can represent the input value for each individual input neuron in the network.</p>



<h3 class="wp-block-heading">2D Tensor Matrix</h3>



<p>A two-dimensional tensor is known as a <strong>matrix</strong>. Here&#8217;s an example:</p>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img loading="lazy" decoding="async" width="1024" height="75" src="https://blog.finxter.com/wp-content/uploads/2022/08/image-77-1024x75.png" alt="" class="wp-image-616231" srcset="https://blog.finxter.com/wp-content/uploads/2022/08/image-77-1024x75.png 1024w, https://blog.finxter.com/wp-content/uploads/2022/08/image-77-300x22.png 300w, https://blog.finxter.com/wp-content/uploads/2022/08/image-77-768x57.png 768w, https://blog.finxter.com/wp-content/uploads/2022/08/image-77.png 1332w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>
</div>


<p>For a fully connected network, where each neuron in one layer connects to every neuron in the next layer, a matrix is typically used to represent all the connection weights. If there are <code>m</code> neurons connected to <code>n</code> neurons you would need an <code>n x m</code> matrix to describe all the connection weights.</p>



<p>Here&#8217;s an example of two neurons connected to three neurons. Here is the network, with connection weights included:</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="442" height="402" src="https://blog.finxter.com/wp-content/uploads/2022/08/image-78.png" alt="" class="wp-image-616232" srcset="https://blog.finxter.com/wp-content/uploads/2022/08/image-78.png 442w, https://blog.finxter.com/wp-content/uploads/2022/08/image-78-300x273.png 300w" sizes="auto, (max-width: 442px) 100vw, 442px" /></figure>
</div>


<p>And here is the connection weights matrix:</p>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img loading="lazy" decoding="async" width="1024" height="106" src="https://blog.finxter.com/wp-content/uploads/2022/08/image-81-1024x106.png" alt="" class="wp-image-616241" srcset="https://blog.finxter.com/wp-content/uploads/2022/08/image-81-1024x106.png 1024w, https://blog.finxter.com/wp-content/uploads/2022/08/image-81-300x31.png 300w, https://blog.finxter.com/wp-content/uploads/2022/08/image-81-768x80.png 768w, https://blog.finxter.com/wp-content/uploads/2022/08/image-81.png 1331w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>
</div>
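


<p>PyTorch follows the same <code>n x m</code> convention. As a minimal sketch, a fully connected layer from two neurons to three neurons stores its weights in a matrix with three rows and two columns (the weights here are randomly initialized by PyTorch, not the values in the figure):</p>



<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import torch
from torch import nn

# m = 2 input neurons fully connected to n = 3 output neurons
layer = nn.Linear(in_features=2, out_features=3, bias=False)

print(layer.weight.shape)    # torch.Size([3, 2]), an n x m matrix</pre>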


<h2 class="wp-block-heading" id="Why-We-Use-Tensors">Why We Use Tensors</h2>



<p>Before we finish introducing tensors, let&#8217;s use what we&#8217;ve seen so far to see why they&#8217;re so important to use when modeling neural networks. </p>



<p>Let&#8217;s introduce a two-element vector of data and run it through the network we just showed. </p>



<p class="has-base-background-color has-background"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/2139.png" alt="ℹ" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Info</strong>: Recall neurons add together their weighted inputs, then run the result through an <a href="https://blog.finxter.com/how-neural-networks-learn/" data-type="post" data-id="568016" target="_blank" rel="noreferrer noopener">activation function</a>. </p>



<p>In this example, we are ignoring the activation function to keep things simple for the demonstration.</p>



<p>Here is our data vector:</p>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img loading="lazy" decoding="async" width="1024" height="76" src="https://blog.finxter.com/wp-content/uploads/2022/08/image-82-1024x76.png" alt="" class="wp-image-616242" srcset="https://blog.finxter.com/wp-content/uploads/2022/08/image-82-1024x76.png 1024w, https://blog.finxter.com/wp-content/uploads/2022/08/image-82-300x22.png 300w, https://blog.finxter.com/wp-content/uploads/2022/08/image-82-768x57.png 768w, https://blog.finxter.com/wp-content/uploads/2022/08/image-82.png 1328w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>
</div>


<p>Here&#8217;s a diagram depicting the operation:</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="442" height="402" src="https://blog.finxter.com/wp-content/uploads/2022/08/image-84.png" alt="" class="wp-image-616247" srcset="https://blog.finxter.com/wp-content/uploads/2022/08/image-84.png 442w, https://blog.finxter.com/wp-content/uploads/2022/08/image-84-300x273.png 300w" sizes="auto, (max-width: 442px) 100vw, 442px" /></figure>
</div>


<p>Let&#8217;s calculate the operation (the neuron computations) by hand:</p>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img loading="lazy" decoding="async" width="1024" height="236" src="https://blog.finxter.com/wp-content/uploads/2022/08/image-83-1024x236.png" alt="" class="wp-image-616246" srcset="https://blog.finxter.com/wp-content/uploads/2022/08/image-83-1024x236.png 1024w, https://blog.finxter.com/wp-content/uploads/2022/08/image-83-300x69.png 300w, https://blog.finxter.com/wp-content/uploads/2022/08/image-83-768x177.png 768w, https://blog.finxter.com/wp-content/uploads/2022/08/image-83.png 1330w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>
</div>


<p>The final result is a 3 element vector:</p>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img loading="lazy" decoding="async" width="1024" height="97" src="https://blog.finxter.com/wp-content/uploads/2022/08/image-79-1024x97.png" alt="" class="wp-image-616236" srcset="https://blog.finxter.com/wp-content/uploads/2022/08/image-79-1024x97.png 1024w, https://blog.finxter.com/wp-content/uploads/2022/08/image-79-300x28.png 300w, https://blog.finxter.com/wp-content/uploads/2022/08/image-79-768x73.png 768w, https://blog.finxter.com/wp-content/uploads/2022/08/image-79.png 1333w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>
</div>


<p>If you have learned about matrices in grade school and remember doing <strong><a href="https://blog.finxter.com/numpy-matmul-operator/" data-type="post" data-id="374" target="_blank" rel="noreferrer noopener">matrix multiplication</a></strong>, you may note that what we just calculated is <em>identical</em> to matrix multiplication:</p>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img loading="lazy" decoding="async" width="1024" height="99" src="https://blog.finxter.com/wp-content/uploads/2022/08/image-80-1024x99.png" alt="" class="wp-image-616237" srcset="https://blog.finxter.com/wp-content/uploads/2022/08/image-80-1024x99.png 1024w, https://blog.finxter.com/wp-content/uploads/2022/08/image-80-300x29.png 300w, https://blog.finxter.com/wp-content/uploads/2022/08/image-80-768x74.png 768w, https://blog.finxter.com/wp-content/uploads/2022/08/image-80.png 1326w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>
</div>


<p class="has-base-background-color has-background"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/2139.png" alt="ℹ" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Note</strong>: Recall matrix multiplication involves multiplying first matrix rows by second matrix columns element-wise, then adding elements together.</p>



<p>This is why tensors are so important for neural networks: <em>tensor math precisely describes neural network operation</em>.</p>



<p>As an added benefit, the equation above showing matrix multiplication is a much more succinct description than nested <code>for</code> loops would be.</p>



<p>If we introduce the nomenclature of bold lower case for a vector and bold upper case for a matrix, then the operation of vector data running through a neural network weight matrix is described by this very compact equation:</p>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img loading="lazy" decoding="async" width="1024" height="51" src="https://blog.finxter.com/wp-content/uploads/2022/08/image-85-1024x51.png" alt="" class="wp-image-616248" srcset="https://blog.finxter.com/wp-content/uploads/2022/08/image-85-1024x51.png 1024w, https://blog.finxter.com/wp-content/uploads/2022/08/image-85-300x15.png 300w, https://blog.finxter.com/wp-content/uploads/2022/08/image-85-768x38.png 768w, https://blog.finxter.com/wp-content/uploads/2022/08/image-85.png 1328w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>
</div>


<p>We will see later that matrix multiplication within PyTorch is a similarly compact code equation.</p>
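


<p>As a quick preview, here is a minimal sketch of that operation in PyTorch. The weight and data values are made up for illustration and are not the ones in the figures above:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import torch

# Hypothetical weights and data (not the values from the figures above)
W = torch.tensor([[0.1, 0.2],
                  [0.3, 0.4],
                  [0.5, 0.6]])      # 3 x 2 weight matrix
x = torch.tensor([1.0, 2.0])       # 2-element data vector

y = W @ x                          # same as torch.matmul(W, x)
print(y.shape)                     # torch.Size([3])
print(y)                           # three weighted sums, one per output neuron</pre>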



<h2 class="wp-block-heading" id="Larger-dimensional-tensors">Higher Dimensional Tensors</h2>



<p>A three-dimensional (3D) tensor is known simply as a <em>tensor</em>. As you can see, the term <em>tensor</em> generically refers to <em>an array of numbers with any number of dimensions</em>. It&#8217;s just one-dimensional and two-dimensional tensors that have the unique names &#8220;vector&#8221; and &#8220;matrix&#8221; respectively.</p>



<p>You might not think that there is a need for three-dimensional and larger tensors, but that&#8217;s not quite true. </p>



<p>A grayscale image is clearly a two-dimensional tensor, in other words, a matrix. But a color image is actually three two-dimensional arrays, one each for red, green, and blue color channels. So a color image is essentially a three-dimensional tensor. </p>



<p>In addition, typically we process data in mini-batches. So if we&#8217;re processing a mini-batch of color images we have the three-dimensional aspect already noted, plus one more dimension of the list of images in the mini-batch. So a mini-batch of color images can be represented by a four-dimensional tensor.</p>
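


<p>Here is a short sketch of those shapes in PyTorch. The image size of 28 by 28 pixels and the batch size of 64 are assumptions for illustration. PyTorch image tensors conventionally put the channel dimension before height and width:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import torch

gray_image = torch.rand(28, 28)            # height x width
color_image = torch.rand(3, 28, 28)        # channels x height x width
batch = torch.rand(64, 3, 28, 28)          # batch x channels x height x width

print(gray_image.ndim, color_image.ndim, batch.ndim)   # 2 3 4
print(batch[0].shape)                      # torch.Size([3, 28, 28]), one image from the batch</pre>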



<h2 class="wp-block-heading" id="Tensors-in-Neural-Network-Libraries">Tensors in Neural Network Libraries</h2>



<p>One Python library that is well suited to working with arrays is <a rel="noreferrer noopener" href="https://blog.finxter.com/numpy-tutorial/" data-type="post" data-id="1356" target="_blank">NumPy</a>. In fact, NumPy is used by some users for implementing neural networks. One example is the <a href="https://blog.finxter.com/how-to-install-scikit-learn-in-python/" data-type="post" data-id="35974" target="_blank" rel="noreferrer noopener">scikit-learn</a> machine learning library which works with NumPy.</p>



<p>However, the PyTorch implementation of tensors is more powerful than NumPy arrays. PyTorch tensors are designed with neural networks in mind. PyTorch tensors have these advantages:</p>



<ol class="wp-block-list">
<li>PyTorch tensors include gradient calculations integrated into them.</li>



<li>PyTorch tensors also support GPU calculations, substantially speeding up neural network calculations.</li>
</ol>



<p>However, if you are used to working with NumPy, you should feel fairly at home with PyTorch tensors. Though the commands to create PyTorch tensors are slightly different, they will feel fairly familiar. For the rest of this article, we will focus exclusively on PyTorch tensors.</p>
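


<p>As a small sketch of both points, the following converts between a NumPy array and a PyTorch tensor, then uses the gradient tracking that NumPy arrays lack. The values are arbitrary examples:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import numpy as np
import torch

# Converting between NumPy arrays and PyTorch tensors
arr = np.array([[1.0, 2.0], [3.0, 4.0]])
t = torch.from_numpy(arr)         # shares memory with arr
back = t.numpy()                  # back to a NumPy array

# Gradient tracking, which NumPy arrays do not offer
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = (x ** 2).sum()
y.backward()                      # computes dy/dx = 2 * x
print(x.grad)                     # tensor([2., 4., 6.])</pre>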



<h2 class="wp-block-heading" id="Tensors-in-PyTorch:-Creating-Them,-and-Doing-Math">Tensors in PyTorch: Creating Them, and Doing Math</h2>



<p>OK, let&#8217;s finally do some coding!</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="498" height="466" src="https://blog.finxter.com/wp-content/uploads/2022/08/WhenTheCodingCodingGIF.gif" alt="" class="wp-image-616332"/></figure>
</div>





<p>First, make sure that you have PyTorch available, either by <a rel="noreferrer noopener" href="https://blog.finxter.com/how-to-install-pytorch-on-pycharm/" data-type="post" data-id="35142" target="_blank">installing</a> on your system or by accessing it through online Jupyter notebook servers. </p>



<p class="has-base-background-color has-background"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f30d.png" alt="🌍" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Reference</strong>: See <a rel="noreferrer noopener" href="https://PyTorch.org/get-started/locally/" data-type="URL" data-id="https://PyTorch.org/get-started/locally/" target="_blank">PyTorch&#8217;s website</a> for instructions on how to install it on your own system.</p>



<p>See this Finxter article for a review of available online Jupyter notebook services:</p>



<p class="has-base-background-color has-background"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f30d.png" alt="🌍" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Recommended Tutorial</strong>: <a href="https://blog.finxer.com/survey-of-python-notebook-options">Top 4 Jupyter Notebook Alternatives for Machine Learning</a></p>



<p>For this article, we will use the online Jupyter notebook service provided by Google called <a rel="noreferrer noopener" href="https://blog.finxter.com/how-to-check-your-tensorflow-version-in-colab/" data-type="post" data-id="29991" target="_blank">Colab</a>. PyTorch is already installed in Colab; we simply have to import it as a module to use it:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import torch</pre>



<p>There are a number of ways of creating tensors in PyTorch. </p>



<p>Typically you would be creating tensors by importing data from data sets available through PyTorch, or by converting your own data into tensors. </p>



<p>For now, since we simply want to demonstrate the use of tensors we will use basic commands to create very simple tensors.</p>



<p>You can create a tensor from a list:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">t_list = torch.tensor([[1,2], [3,4]])
t_list</pre>



<p>Output:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">tensor([[1, 2],
        [3, 4]])</pre>



<p>Note that when we evaluate the tensor variable, the output is labeled to indicate that it is a tensor. This means that it is a PyTorch <strong>tensor object</strong>: an object within PyTorch that behaves just like a mathematical tensor and also has various features provided by PyTorch (such as support for gradient calculations and GPU processing).</p>
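


<p>Some of those extra features are visible as attributes on the tensor object. The sketch below inspects the data type and device of the tensor, and moves it to a GPU only if one happens to be available:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import torch

t_list = torch.tensor([[1, 2], [3, 4]])
print(t_list.dtype)       # torch.int64, inferred from the Python integers
print(t_list.device)      # cpu, the default location

# Move to the GPU only if one is available
device = "cuda" if torch.cuda.is_available() else "cpu"
t_on_device = t_list.to(device)
print(t_on_device.device)</pre>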



<p>You can create tensors filled with zeros, filled with ones, or filled with random numbers:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">t_zeros = torch.zeros(2,3)
t_zeros</pre>



<p>Output:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">tensor([[0., 0., 0.],
        [0., 0., 0.]])</pre>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">t_ones = torch.ones(3,2)
t_ones
</pre>



<p>Output:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">tensor([[1., 1.],
        [1., 1.],
        [1., 1.]])</pre>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">t_rand = torch.rand(3,2,4)
t_rand
</pre>



<p>Output:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">tensor([[[0.9661, 0.3915, 0.0263, 0.2753],
         [0.7866, 0.0503, 0.3963, 0.1334]],

        [[0.4085, 0.1816, 0.2827, 0.3428],
         [0.9923, 0.4543, 0.0872, 0.0771]],

        [[0.2451, 0.6048, 0.8686, 0.8148],
         [0.7930, 0.4150, 0.6125, 0.3401]]])</pre>



<p>An important attribute to be familiar with to understand the shape of a tensor is the appropriately named <strong><code><a href="https://blog.finxter.com/how-to-get-shape-of-array/" data-type="post" data-id="268" target="_blank" rel="noreferrer noopener">shape</a></code></strong> attribute:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">t_rand.shape
# Output: torch.Size([3, 2, 4])</pre>



<p>This shows you that tensor &#8220;<code>t_rand</code>&#8221; is a three-dimensional tensor composed of three elements, each with two rows and four columns.</p>



<p class="has-global-color-8-background-color has-background"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Note</strong>: The number of dimensions of a tensor is referred to as its <strong><code>rank</code></strong>. A one-dimensional tensor, or vector, is a rank-1 tensor; a two-dimensional tensor, or matrix, is a rank-2 tensor; a three-dimensional tensor is a rank-3 tensor, and so on.</p>
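<p>As a quick check of the rank idea, here is a minimal sketch that reads the number of dimensions from a few tensors using the <code>ndim</code> attribute. The variable names are made up for illustration and are not used elsewhere in this article.</p>

```python
import torch

# Illustrative tensors of increasing rank (names are hypothetical).
scalar = torch.tensor(5)                 # rank-0: a single number
vector = torch.tensor([1, 2, 3])         # rank-1: a vector
matrix = torch.tensor([[1, 2], [3, 4]])  # rank-2: a matrix
cube = torch.zeros(3, 2, 4)              # rank-3: same shape as t_rand

# ndim reports the number of dimensions, i.e. the rank.
ranks = [t.ndim for t in (scalar, vector, matrix, cube)]
print(ranks)            # [0, 1, 2, 3]

# The rank is also the number of entries in the shape.
print(len(cube.shape))  # 3
```

<p>The rank always equals the length of the tensor&#8217;s <code>shape</code>, which is why the two attributes are usually inspected together.</p>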



<p>Let&#8217;s do some math with tensors &#8211; let&#8217;s add two tensors together:</p>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img loading="lazy" decoding="async" width="1024" height="75" src="https://blog.finxter.com/wp-content/uploads/2022/08/image-86-1024x75.png" alt="" class="wp-image-616249" srcset="https://blog.finxter.com/wp-content/uploads/2022/08/image-86-1024x75.png 1024w, https://blog.finxter.com/wp-content/uploads/2022/08/image-86-300x22.png 300w, https://blog.finxter.com/wp-content/uploads/2022/08/image-86-768x56.png 768w, https://blog.finxter.com/wp-content/uploads/2022/08/image-86.png 1325w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>
</div>


<p>Note the tensors are added together <a href="https://blog.finxter.com/how-to-add-two-lists-element-wise-in-python/" data-type="post" data-id="391288" target="_blank" rel="noreferrer noopener">element-wise</a>. Now here it is in PyTorch:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">t_first = torch.tensor([[1,2], [3,4]])
t_second = torch.tensor([[5,6],[7,8]])
t_sum = t_first + t_second
t_sum</pre>



<p>Output:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">tensor([[ 6,  8],
        [10, 12]])</pre>



<p>Let&#8217;s add a scalar, that is, an independent number (or a rank-0 tensor!) to a tensor:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">t_add3 = t_first + 3
t_add3</pre>



<p>Output:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">tensor([[4, 5],
        [6, 7]])</pre>



<p>Note that the scalar is added to each element of the tensor. The same applies when multiplying a scalar by a tensor:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">t_times3 = t_first * 3
t_times3</pre>



<p>Output:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">tensor([[ 3,  6],
        [ 9, 12]])</pre>



<p>The same applies to raising a tensor to a power; that is, the <a href="https://blog.finxter.com/python-exponent-operator/" data-type="post" data-id="31606" target="_blank" rel="noreferrer noopener">power operation</a> is applied element-wise:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">t_squared = t_first ** 2
t_squared
</pre>



<p>Output:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">tensor([[ 1,  4],
        [ 9, 16]])</pre>



<p>Recall that after summing weighted inputs, the neuron processes the result through an activation function. The same behavior applies here as well: when a vector is processed through an <a href="https://blog.finxter.com/how-neural-networks-learn/" data-type="post" data-id="568016" target="_blank" rel="noreferrer noopener">activation function</a>, the function is applied to the vector element-wise.</p>
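<p>To make the element-wise behavior concrete, here is a small sketch that applies a sigmoid activation to a vector with <code>torch.sigmoid</code> and compares it with the same calculation done one element at a time in plain Python. The input values are arbitrary and chosen only for illustration.</p>

```python
import math
import torch

# Arbitrary summed, weighted inputs for three neurons (illustrative values).
z = torch.tensor([-2.0, 0.0, 2.0])

# The activation function is applied to every element of the vector.
activated = torch.sigmoid(z)

# The same calculation done one element at a time in plain Python.
by_hand = [1 / (1 + math.exp(-v)) for v in z.tolist()]

print(activated)
print(by_hand)
```

<p>Both results hold one value per input element, and the sigmoid of 0 is 0.5 in each.</p>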



<p>Earlier, we pointed out that matrix multiplication is an important part of neural network calculations. </p>



<p>There are two ways to do this in PyTorch: you can use the <code><a href="https://blog.finxter.com/python-__matmul__-magic-method/" data-type="post" data-id="36136" target="_blank" rel="noreferrer noopener">matmul</a></code> function:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">t_matmul1 = torch.matmul(t_first, t_second)
t_matmul1</pre>



<p>Output:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">tensor([[19, 22],
        [43, 50]])</pre>



<p>Or you can use the matrix multiplication symbol &#8220;<code><a href="https://blog.finxter.com/numpy-matmul-operator/" data-type="post" data-id="374" target="_blank" rel="noreferrer noopener">@</a></code>&#8220;:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">t_matmul2 = t_first @ t_second
t_matmul2</pre>



<p>Output:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">tensor([[19, 22],
        [43, 50]])</pre>



<p>Recall that earlier we showed an input signal running through a neural network, where a vector of input signals was multiplied by a matrix of connection weights.</p>



<p>Here is that in PyTorch:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">x = torch.tensor([[7],[8]])
x</pre>



<p>Output:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">tensor([[7],
        [8]])</pre>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">W = torch.tensor([[1,4], [2,5], [3,6]])
W</pre>



<p>Output:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">tensor([[1, 4],
        [2, 5],
        [3, 6]])</pre>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">y = W @ x
y</pre>



<p>Output:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">tensor([[39],
        [54],
        [69]])</pre>



<p>Note how compact and readable that is instead of doing nested <code>for</code> loops.</p>
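<p>For comparison, here is a sketch of what the same multiplication could look like with nested <code>for</code> loops over plain Python lists. It reuses the values of <code>W</code> and <code>x</code> from above; the loop variables and the helper names <code>rows</code>, <code>inner</code>, and <code>cols</code> are ours.</p>

```python
# The same weight matrix and input vector as above, as nested Python lists.
W = [[1, 4], [2, 5], [3, 6]]  # 3 rows by 2 columns
x = [[7], [8]]                # 2 rows by 1 column

rows, inner, cols = len(W), len(x), len(x[0])

# Start with a result matrix of zeros, 3 rows by 1 column.
y = [[0] * cols for _ in range(rows)]

# Each output entry is the sum of products of a row of W and a column of x.
for i in range(rows):
    for j in range(cols):
        for k in range(inner):
            y[i][j] += W[i][k] * x[k][j]

print(y)  # [[39], [54], [69]]
```

<p>The three nested loops produce the same numbers as the single expression <code>W @ x</code>.</p>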



<p>Other math can be done with tensors as well, but we have covered most situations that are relevant to neural networks. If you find you need to do additional math with your tensors, check PyTorch documentation or do a web search.</p>



<h2 class="wp-block-heading" id="Indexing-and-Slicing-Tensors">Indexing and Slicing Tensors</h2>



<p><a href="https://blog.finxter.com/introduction-to-slicing-in-python/" data-type="post" data-id="731" target="_blank" rel="noreferrer noopener">Slicing</a> allows you to examine subsets of your data and better understand how the dataset is constructed. You may find you will use this a lot.</p>



<h3 class="wp-block-heading">Indexing and Slicing: PyTorch vs NumPy vs Python Lists</h3>



<p><a rel="noreferrer noopener" href="https://blog.finxter.com/numpy-boolean-indexing/" data-type="post" data-id="2877" target="_blank">Indexing</a> and slicing tensors works the same way as it does with NumPy arrays. Note that the syntax differs from nested Python lists. With Python lists, a separate pair of brackets is used for each level of nesting. With PyTorch, one pair of brackets contains the indices for all dimensions, separated by commas.</p>
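<p>The difference in syntax can be seen side by side in the following sketch. The small <code>nested</code> list and the tensor <code>t</code> built from it are made-up examples and not part of the data used elsewhere in this article.</p>

```python
import torch

# A made-up 2 x 2 x 3 nested list and the tensor built from it.
nested = [[[1, 2, 3], [4, 5, 6]],
          [[7, 8, 9], [10, 11, 12]]]
t = torch.tensor(nested)

# Python lists: one pair of brackets per level of nesting.
from_list = nested[1][0][2]

# PyTorch tensors: one pair of brackets, indices separated by commas.
from_tensor = t[1, 0, 2].item()

print(from_list, from_tensor)  # 9 9
```

<p>Chained brackets such as <code>t[1][0][2]</code> also work on tensors, but the comma form does not work on nested lists: <code>nested[1, 0, 2]</code> raises a <code>TypeError</code>.</p>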



<p>Let&#8217;s find the item in tensor &#8220;<code>t_rand</code>&#8221; at the second element, first row, third column. First, here is &#8220;<code>t_rand</code>&#8221; again:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">t_rand</pre>



<p>Output:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">tensor([[[0.9661, 0.3915, 0.0263, 0.2753],
         [0.7866, 0.0503, 0.3963, 0.1334]],

        [[0.4085, 0.1816, 0.2827, 0.3428],
         [0.9923, 0.4543, 0.0872, 0.0771]],

        [[0.2451, 0.6048, 0.8686, 0.8148],
         [0.7930, 0.4150, 0.6125, 0.3401]]])</pre>



<p>And here is the item at the 2nd element, first row, and third column (don&#8217;t forget indexing starts at <a rel="noreferrer noopener" href="https://blog.finxter.com/daily-python-puzzle-list-indexing/" data-type="post" data-id="84" target="_blank">zero</a>):</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">t_rand[1, 0, 2]
# Output: tensor(0.2827)</pre>



<p>Let&#8217;s look at a slice: second element, first row, second through third columns:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">t_rand[1, 0, 1:3]
# Output: tensor([0.1816, 0.2827])</pre>



<p>Let&#8217;s look at the entire third column of every element:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">t_rand[:, :, 2]</pre>



<p>Output:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">tensor([[0.0263, 0.3963],
        [0.2827, 0.0872],
        [0.8686, 0.6125]])</pre>



<p class="has-global-color-8-background-color has-background"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/2139.png" alt="ℹ" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Important Slicing Tip</strong>: In the above, we use the standard Python convention that a blank before a &#8220;<code>:</code>&#8221; means &#8220;start from the beginning&#8221;, and a blank after a &#8220;<code>:</code>&#8221; means &#8220;go all the way to the end&#8221;. So a &#8220;<code>:</code>&#8221; alone means &#8220;include everything from beginning to end&#8221;.</p>
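<p>Because this is the standard Python convention, it can be tried on an ordinary list. The sketch below uses one row of values copied from &#8220;<code>t_rand</code>&#8221;; the name <code>row</code> is ours.</p>

```python
# One row of values copied from t_rand, as a plain Python list.
row = [0.4085, 0.1816, 0.2827, 0.3428]

print(row[:2])   # blank before ":" starts at the beginning: [0.4085, 0.1816]
print(row[2:])   # blank after ":" runs to the end: [0.2827, 0.3428]
print(row[:])    # ":" alone includes everything
print(row[1:3])  # explicit start and stop; the stop index is excluded: [0.1816, 0.2827]
```

<p>Note that the stop index is never included, which is why <code>1:3</code> returns the second and third items only.</p>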



<p>A likely use for slicing would be to look at a full array (i.e. a matrix) within a set of arrays, i.e. one image out of a set of images. </p>



<p>Let&#8217;s pretend our &#8220;<code>t_rand</code>&#8221; tensor is a list of images. We may wish to sample just a few &#8220;images&#8221; to get an idea of what they are like. </p>



<p>Let&#8217;s examine the first &#8220;image&#8221; in our tensor (&#8220;list of images&#8221;):</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">t_rand[0]</pre>



<p>Output:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">tensor([[0.9661, 0.3915, 0.0263, 0.2753],
        [0.7866, 0.0503, 0.3963, 0.1334]])</pre>



<p>And here is the last array (&#8220;image&#8221;) in tensor &#8220;<code>t_rand</code>&#8221;:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">t_rand[-1]</pre>



<p>Output:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">tensor([[0.2451, 0.6048, 0.8686, 0.8148],
        [0.7930, 0.4150, 0.6125, 0.3401]])</pre>



<p>Using small tensors to demonstrate indexing can be instructive, but let&#8217;s see it in action for real. Let&#8217;s examine some real datasets with real images.</p>



<h2 class="wp-block-heading">Real Example</h2>



<p>We won&#8217;t describe the following in detail, except to note that we are importing various libraries that allow us to download and work with a dataset. The last line creates a function that converts tensors into PIL images:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">import torch
from torch.utils.data import Dataset
from torchvision import datasets
from torchvision.transforms import ToTensor
import matplotlib.pyplot as plt

import torchvision.transforms as T

conv_to_PIL = T.ToPILImage()
</pre>



<p>The following downloads the Caltech 101 dataset, which is a collection of over 8000 images in 101 categories:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">caltech101_data = datasets.Caltech101(
    root="data",
    download=True,
    transform=ToTensor()
)
</pre>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">Extracting data/caltech101/101_ObjectCategories.tar.gz to data/caltech101
Extracting data/caltech101/Annotations.tar to data/caltech101</pre>



<p>This has created a <strong>dataset object</strong> which is a container for the data. These objects can be indexed like lists:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">len(caltech101_data)
# 8677

type(caltech101_data[0])
# tuple

len(caltech101_data[0])
# 2</pre>



<p>The above code shows that the dataset contains 8677 items. Looking at the first item, we can see that each item is a tuple of two elements. Here are the types of those two elements:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">type(caltech101_data[0][0])
# torch.Tensor

type(caltech101_data[0][1])
# int</pre>



<p>The two items in the tuple are the image as a tensor, and an integer code corresponding to the image&#8217;s category.</p>



<p>Colab has a convenient function <strong><code>display()</code></strong> which will display images. First, we use the conversion function we created earlier to convert the tensor to a <a href="https://blog.finxter.com/pillow-to-convert-image-formats-png-jpg-and-more/" data-type="post" data-id="130706" target="_blank" rel="noreferrer noopener">PIL image</a>, then we display the image.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">img = conv_to_PIL(caltech101_data[0][0])
display(img)</pre>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="510" height="337" src="https://blog.finxter.com/wp-content/uploads/2022/08/image-88.png" alt="" class="wp-image-616252" srcset="https://blog.finxter.com/wp-content/uploads/2022/08/image-88.png 510w, https://blog.finxter.com/wp-content/uploads/2022/08/image-88-300x198.png 300w" sizes="auto, (max-width: 510px) 100vw, 510px" /></figure>
</div>


<p>We can use indexing to sample and display a few other images from the set:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">img = conv_to_PIL(caltech101_data[1234][0])
display(img)</pre>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="266" height="300" src="https://blog.finxter.com/wp-content/uploads/2022/08/image-87.png" alt="" class="wp-image-616250"/></figure>
</div>


<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">img = conv_to_PIL(caltech101_data[4321][0])
display(img)</pre>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="266" height="300" src="https://blog.finxter.com/wp-content/uploads/2022/08/image-87.png" alt="" class="wp-image-616251"/></figure>
</div>


<h2 class="wp-block-heading" id="Summary">Summary</h2>



<p>We have learned a number of things:</p>



<ol class="wp-block-list">
<li>What tensors are</li>



<li>Why tensors are key mathematical objects for describing and implementing neural networks</li>



<li>Creating tensors in PyTorch</li>



<li>Doing math with tensors in PyTorch</li>



<li>Doing indexing and slicing of tensors in PyTorch, especially to examine images in datasets</li>
</ol>



<p>We hope you have found this article informative. We wish you happy coding!</p>



<p>The next article in the series is the following:</p>



<p class="has-base-background-color has-background"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f30d.png" alt="🌍" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Recommended Tutorial</strong>: <a href="https://blog.finxter.com/using-pytorch-to-build-a-working-neural-network/" data-type="post" data-id="903641" target="_blank" rel="noreferrer noopener">Using PyTorch to Build a Working Neural Network</a></p>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading">Programmer Humor</h2>


<div class="wp-block-image">
<figure class="aligncenter size-full is-resized"><a href="https://imgs.xkcd.com/comics/computers_vs_humans.png" target="_blank" rel="noreferrer noopener"><img loading="lazy" decoding="async" src="https://blog.finxter.com/wp-content/uploads/2022/06/image-163.png" alt="" class="wp-image-435467" width="578" height="282" srcset="https://blog.finxter.com/wp-content/uploads/2022/06/image-163.png 578w, https://blog.finxter.com/wp-content/uploads/2022/06/image-163-300x146.png 300w" sizes="auto, (max-width: 578px) 100vw, 578px" /></a><figcaption><em>It&#8217;s hard to train deep learning algorithms when most of the positive feedback they get is sarcastic.</em> &#8212; from <a href="https://imgs.xkcd.com/comics/computers_vs_humans.png" data-type="URL" data-id="https://imgs.xkcd.com/comics/computers_vs_humans.png" target="_blank" rel="noreferrer noopener">xkcd</a></figcaption></figure>
</div><p>The post <a href="https://blog.finxter.com/tensors-the-vocabulary-of-neural-networks/">Tensors: The Vocabulary of Neural Networks</a> appeared first on <a href="https://blog.finxter.com">Be on the Right Side of Change</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>How Neural Networks Learn</title>
		<link>https://blog.finxter.com/how-neural-networks-learn/</link>
		
		<dc:creator><![CDATA[Aaron Glatzer]]></dc:creator>
		<pubDate>Fri, 12 Aug 2022 06:52:25 +0000</pubDate>
				<category><![CDATA[Algorithms]]></category>
		<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[Computer Science]]></category>
		<category><![CDATA[Machine Learning]]></category>
		<guid isPermaLink="false">https://blog.finxter.com/?p=568016</guid>

					<description><![CDATA[<p>Artificial neural networks have become a powerful tool providing many benefits in our modern world. They are used to filter out spam, to perform voice recognition, and are even being developed to drive cars, among many other things. As remarkable as these tools are, they are readily within the grasp of almost anyone. If you ... <a title="How Neural Networks Learn" class="read-more" href="https://blog.finxter.com/how-neural-networks-learn/" aria-label="Read more about How Neural Networks Learn">Read more</a></p>
<p>The post <a href="https://blog.finxter.com/how-neural-networks-learn/">How Neural Networks Learn</a> appeared first on <a href="https://blog.finxter.com">Be on the Right Side of Change</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<figure class="wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio"><div class="wp-block-embed__wrapper">
<iframe loading="lazy" title="How Neural Networks Learn" width="937" height="527" src="https://www.youtube.com/embed/x4HH2A23gvE?feature=oembed" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe>
</div></figure>



<p>Artificial neural networks have become a powerful tool providing many benefits in our modern world. They are used to filter out spam, to perform voice recognition, and are even being developed to drive cars, among many other things.</p>



<p>As remarkable as these tools are, they are readily within the grasp of almost anyone. If you have technical interest and have some experience with computer programming you can build your own <a href="https://blog.finxter.com/tutorial-how-to-create-your-first-neural-network-in-1-line-of-python-code/" data-type="post" data-id="2463" target="_blank" rel="noreferrer noopener">neural networks</a>.</p>



<p>But before you learn the hands-on details of building neural networks you should learn some of the fundamentals of how they work. This article will cover one of those fundamentals &#8211; how neural networks learn.</p>



<p class="has-global-color-8-background-color has-background"><strong>Note</strong>: This article includes some algebra and calculus. If you&#8217;re not comfortable with algebra, you should still be able to understand the content from the graphs and descriptions. The calculus is not worked through in any detail, so you should be able to follow along from the descriptions there as well. You will not learn the details of how the calculations are done. Instead, you will gain an intuitive understanding of what is going on.</p>



<p>Before learning this, you should be familiar with the basics of how neural networks are structured and how they operate. The article <a href="https://blog.finxter.com/the-magic-of-neural-networks-how-they-work/" target="_blank" rel="noreferrer noopener">&#8220;The Magic of Neural Networks: History and Concepts&#8221;</a> covers these basics. Still, we offer the following brief refresher.</p>



<h2 class="wp-block-heading">Basic Fundamentals: How Neural Networks Work</h2>



<p>Figure 1 shows an artificial neuron. </p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="731" height="372" src="https://blog.finxter.com/wp-content/uploads/2022/08/image-19.png" alt="" class="wp-image-568036" srcset="https://blog.finxter.com/wp-content/uploads/2022/08/image-19.png 731w, https://blog.finxter.com/wp-content/uploads/2022/08/image-19-300x153.png 300w" sizes="auto, (max-width: 731px) 100vw, 731px" /><figcaption><strong>Figure 1</strong>: artificial neuron</figcaption></figure>
</div>


<p>Signals from other neurons come in through multiple inputs, each multiplied by its corresponding <strong>weight</strong> (weights express the connection strengths between the neuron and each of its upstream neurons).</p>



<p>A <strong>bias</strong> is input as well (the bias expresses a neuron&#8217;s inherent activation, independent of its input from other neurons). All these inputs are added together, and the resulting total signal is then processed through the <strong>activation function</strong> (a sigmoid function is shown here).</p>
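<p>The calculation described above can be sketched in a few lines of plain Python. This is only an illustration: the function names and the input, weight, and bias values are made up, and a sigmoid is assumed as the activation function, as in Figure 1.</p>

```python
import math

def sigmoid(z):
    """Sigmoid activation: squashes any number into the range 0 to 1."""
    return 1 / (1 + math.exp(-z))

def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs plus the bias, passed through the activation."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(total)

# Made-up values for illustration only.
inputs = [0.5, -1.0, 2.0]    # signals from three upstream neurons
weights = [0.4, 0.3, -0.2]   # connection strength for each input
bias = 0.1                   # the neuron's inherent activation

out = neuron_output(inputs, weights, bias)
print(out)
```

<p>Changing any weight or the bias changes the output for the same inputs, which is what training takes advantage of.</p>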





<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="1000" height="450" src="https://blog.finxter.com/wp-content/uploads/2022/08/image-20.png" alt="" class="wp-image-568044" srcset="https://blog.finxter.com/wp-content/uploads/2022/08/image-20.png 1000w, https://blog.finxter.com/wp-content/uploads/2022/08/image-20-300x135.png 300w, https://blog.finxter.com/wp-content/uploads/2022/08/image-20-768x346.png 768w" sizes="auto, (max-width: 1000px) 100vw, 1000px" /><figcaption><strong>Figure 2</strong>: neural network classifying an image (Dog photo by <a href="https://www.pexels.com/photo/shallow-focus-photography-of-a-golden-retriever-686094/" data-type="URL" data-id="https://www.pexels.com/photo/shallow-focus-photography-of-a-golden-retriever-686094/" target="_blank" rel="noreferrer noopener">Garfield Besa</a>)</figcaption></figure>
</div>


<p>Figure 2 shows a network of these neurons. Signals are introduced on the input side, and they progress through the network, passing through neurons and along their connections, getting processed by the calculations described above. How the signals are processed depends on the weights and biases among all the neurons.</p>



<p class="has-global-color-8-background-color has-background"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" /> The <strong>key takeaway</strong> is that it is the settings of the weights and biases that establish how the network as a whole computes. In other words, the learning and memory of the network are encoded by the weights and biases.</p>



<p>So how does one program these weights and biases? </p>



<p>They are set by training the network with samples and letting it learn by example. The details of how that is done are the subject of this article.</p>



<h2 class="wp-block-heading">Overview of How Neural Networks Learn</h2>



<p>As mentioned, a neural network&#8217;s learning and memory are encoded by the connection weights and biases of the neurons throughout the network.</p>



<p>These weights and biases are set by training the network on examples by following this six-step training procedure:</p>



<ol class="wp-block-list"><li>Provide a sample to the network.</li><li>Since the network is untrained, it will probably get the wrong answer.</li><li>Compute how far this answer is from the correct answer. This error is known as <strong>loss</strong>.</li><li>Calculate what changes in the weights and biases will make the loss smaller.</li><li>Make adjustments to those weights and biases as determined by those calculations.</li><li>Repeat this again and again with numerous samples until the network learns to answer the samples correctly.</li></ol>
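<p>The six steps above can be sketched with a deliberately tiny stand-in for a network: a single weight <code>w</code> with no bias, where the prediction is <code>w * x</code>. Everything in this sketch is an assumption made for illustration, including the samples, the learning rate, the use of squared error as the loss, and the simple gradient update. Real networks adjust many weights and biases in the same spirit.</p>

```python
# Hypothetical toy problem: learn w so that w * x matches each label.
# The labels were generated with w = 3, which the loop should recover.
samples = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]

w = 0.0               # the untrained weight
learning_rate = 0.05  # how large each adjustment is (illustrative value)

for epoch in range(200):                         # step 6: repeat again and again
    for x, label in samples:                     # step 1: provide a sample
        prediction = w * x                       # step 2: the (initially wrong) answer
        loss = (prediction - label) ** 2         # step 3: how far off the answer is
        gradient = 2 * (prediction - label) * x  # step 4: how the loss changes with w
        w -= learning_rate * gradient            # step 5: adjust the weight

print(round(w, 4))  # 3.0
```

<p>The weight starts far from the right value and moves toward it a little with every sample, because each adjustment is made in the direction that reduces the loss.</p>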



<h2 class="wp-block-heading">Presenting Samples and Calculating Loss</h2>



<p>Let&#8217;s review some of this in more detail while considering a use case. </p>



<p>Imagine we want to train a network to estimate crowd size. </p>



<p>To do this, we must first train the network with a large set of images of crowds. For each image, the number of people is counted. We then include <strong>labels</strong> indicating the correct crowd size for each picture. This is known as a <strong>training set</strong>.</p>



<p>The pictures are submitted to the network, which then indicates its crowd estimate for each picture. Since the network is not trained, it surely gets the estimate wrong for each image. </p>



<p>For each image/label pair, the network calculates the loss for that sample. </p>



<p>Multiple possible choices can be used for calculating loss. One can choose any calculation that appropriately expresses how far the network&#8217;s answer is from the correct answer. </p>



<p>An appropriate choice of loss for crowd-size estimation is the squared error:</p>


<div class="wp-block-image">
<figure class="aligncenter size-large is-resized"><img loading="lazy" decoding="async" src="https://blog.finxter.com/wp-content/uploads/2022/08/image-21-1024x56.png" alt="" class="wp-image-568060" width="847" height="46" srcset="https://blog.finxter.com/wp-content/uploads/2022/08/image-21-1024x56.png 1024w, https://blog.finxter.com/wp-content/uploads/2022/08/image-21-300x16.png 300w, https://blog.finxter.com/wp-content/uploads/2022/08/image-21-768x42.png 768w, https://blog.finxter.com/wp-content/uploads/2022/08/image-21.png 1321w" sizes="auto, (max-width: 847px) 100vw, 847px" /></figure>
</div>


<p>where:</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="415" height="107" src="https://blog.finxter.com/wp-content/uploads/2022/08/image-22.png" alt="" class="wp-image-568064" srcset="https://blog.finxter.com/wp-content/uploads/2022/08/image-22.png 415w, https://blog.finxter.com/wp-content/uploads/2022/08/image-22-300x77.png 300w" sizes="auto, (max-width: 415px) 100vw, 415px" /></figure>
</div>


<p>Suppose we submit an image showing a crowd size of 500 people. Figure 3 shows how the error varies for crowd estimates around the true crowd size of 500 people.</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="413" height="266" src="https://blog.finxter.com/wp-content/uploads/2022/08/image-23.png" alt="" class="wp-image-568070" srcset="https://blog.finxter.com/wp-content/uploads/2022/08/image-23.png 413w, https://blog.finxter.com/wp-content/uploads/2022/08/image-23-300x193.png 300w" sizes="auto, (max-width: 413px) 100vw, 413px" /><figcaption><strong>Figure 3</strong></figcaption></figure>
</div>


<p>If the network guesses 350 people, the loss is 22500. If the network guesses 600 people, the loss is 10000. </p>



<p>Clearly, the loss is minimized when the network guesses the correct crowd size of 500 people.</p>
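<p>As a quick check of these numbers, here is a minimal sketch of the squared-error calculation. The function name <code>squared_error</code> and its argument names are our own illustrative choices, and we assume the loss is simply the squared difference between the estimate and the label, which is consistent with the values quoted above.</p>

```python
def squared_error(estimate, actual):
    """Squared-error loss: the squared gap between the estimate and the label."""
    return (estimate - actual) ** 2


true_crowd = 500  # label for the example image

# Loss for a low guess, the correct guess, and a high guess
for guess in (350, 500, 600):
    print(guess, squared_error(guess, true_crowd))
```

<p>The loss is zero only when the guess equals the label, and it grows quickly as the guess moves away in either direction.</p>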



<p>But recall we said it is the weights and biases in the network that encode its learning and memory, so it is the weights and biases that determine if the network gets the right answer. So we need to adjust the weights and biases so that the network gets closer to the correct answer for this image.</p>



<p>In other words, we need to change the weights and biases to minimize the loss. To do that, we need to figure out how the loss varies when we vary the weights and biases.</p>



<h2 class="wp-block-heading">Minimizing Loss: Calculus and the Derivative</h2>



<p>So how do we calculate how loss changes when we vary weights and biases? </p>



<p>This is where calculus comes in.</p>



<p><em>(Don&#8217;t worry if you don&#8217;t know calculus; we&#8217;ll show you everything you need to know, and we&#8217;ll keep it intuitive.) </em></p>



<p>Calculus is all about determining how one variable is affected by changes in another variable.</p>



<p><em>(Strictly speaking there&#8217;s more to calculus than that, but this idea is one of the core ideas of calculus.)</em></p>



<p class="has-global-color-8-background-color has-background"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" /> The loss L depends on network output y, but y depends on input, and on weights w and biases b. So there is a somewhat long and complicated chain of dependencies we have to go through to figure out how L varies when w and b vary. </p>



<p>However, for the sake of learning, let&#8217;s instead start by just examining how L varies when y varies, since this is simpler and will help develop an intuition for calculus.</p>



<p>How L depends on y is somewhat easy &#8211; we saw the equation for it earlier, and we saw the graph of that equation in Figure 3. We can tell by looking at the graph that if the network guesses 350 then we need to increase the output y in order to reduce the loss, and that if the network guesses 600 then we need to decrease the output y in order to reduce the loss.</p>



<p>But with neural networks, we never have the luxury of being able to examine the graph of the loss to figure it out. </p>



<p>We can, however, use calculus to get our answer. To do this, we do what is called <strong>taking the derivative</strong>. </p>



<p>Here is the derivative of the equation for the graph in Figure 3 (we will not explain how this is calculated; that is the domain of a calculus course):</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="70" src="https://blog.finxter.com/wp-content/uploads/2022/08/image-24-1024x70.png" alt="" class="wp-image-568081" srcset="https://blog.finxter.com/wp-content/uploads/2022/08/image-24-1024x70.png 1024w, https://blog.finxter.com/wp-content/uploads/2022/08/image-24-300x20.png 300w, https://blog.finxter.com/wp-content/uploads/2022/08/image-24-768x52.png 768w, https://blog.finxter.com/wp-content/uploads/2022/08/image-24.png 1311w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p>This is typically referred to as &#8220;taking the derivative of L with respect to y&#8221;. You can read that <em><strong>dL/dy</strong></em> as saying &#8220;this is how L changes when y changes&#8221;. Now let&#8217;s calculate how L changes when y changes at the point y = 350:</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="70" src="https://blog.finxter.com/wp-content/uploads/2022/08/image-25-1024x70.png" alt="" class="wp-image-568084" srcset="https://blog.finxter.com/wp-content/uploads/2022/08/image-25-1024x70.png 1024w, https://blog.finxter.com/wp-content/uploads/2022/08/image-25-300x21.png 300w, https://blog.finxter.com/wp-content/uploads/2022/08/image-25-768x52.png 768w, https://blog.finxter.com/wp-content/uploads/2022/08/image-25.png 1317w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p>So at y = 350, L decreases at a rate of 300 for each unit that y increases. That implies that when we increase y, the loss will decrease.</p>



<p>Now let&#8217;s calculate how L changes when y changes at the point y = 600:</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="60" src="https://blog.finxter.com/wp-content/uploads/2022/08/image-26-1024x60.png" alt="" class="wp-image-568087" srcset="https://blog.finxter.com/wp-content/uploads/2022/08/image-26-1024x60.png 1024w, https://blog.finxter.com/wp-content/uploads/2022/08/image-26-300x18.png 300w, https://blog.finxter.com/wp-content/uploads/2022/08/image-26-768x45.png 768w, https://blog.finxter.com/wp-content/uploads/2022/08/image-26.png 1328w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p>So at y = 600, L increases at a rate of 200 for each unit that y increases. Since we want to <em>decrease</em> L, that means we need to <em>decrease</em> y.</p>



<p>These calculations match what we concluded from looking at the graph.</p>



<p>You can also read <em>dL/dy</em> as saying <em>&#8220;this is the slope of the graph&#8221;</em>. </p>



<p>This makes sense: at point y = 350 the slope of the graph is -300 (sloping down steeply), while at point y = 600 the slope of the graph is 200 (sloping up, not quite so steeply).</p>



<p>So by using calculus and taking the derivative, we can figure out which way to change y to reduce the loss L, even when we can&#8217;t see the graph to figure it out.</p>
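<p>To make this concrete, the sketch below compares the derivative with a direct numerical estimate of the slope. We assume the loss is the squared error with a target of 500, so that its derivative is 2 × (y − 500), which agrees with the values of -300 and 200 above. The function names are hypothetical.</p>

```python
def loss(y, target=500):
    """Squared-error loss for a single estimate y."""
    return (y - target) ** 2


def dloss_dy(y, target=500):
    """Analytic derivative of (y - target)**2 with respect to y."""
    return 2 * (y - target)


def numeric_slope(f, y, h=1e-3):
    """Central finite-difference estimate of the slope of f at y."""
    return (f(y + h) - f(y - h)) / (2 * h)


# Compare the analytic derivative with the measured slope of the graph
for y in (350, 600):
    print(y, dloss_dy(y), round(numeric_slope(loss, y), 3))
```

<p>The sign of the derivative tells us which way to move y: a negative slope means increase y, a positive slope means decrease it.</p>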



<p>Recall, however, that we want to figure out how to change the weights and biases to reduce the loss L. Also recall there is a chain of dependencies, of L depending on y, which itself depends on w and b (for several layers worth of w and b!), and on input. </p>



<p>So a full description could result in some rather complicated equations and some difficult derivatives. For those curious about the math details, the method for figuring out derivatives when there are such dependencies is called <strong>the chain rule</strong>.</p>



<p>Fortunately, with modern neural network software, the computer takes care of calculating derivatives and keeping track of and resolving the chains of dependencies in the derivatives. Just understand that, even if we can&#8217;t see its graph:</p>



<ul class="wp-block-list"><li>there is some relationship between the loss L and the weights w and biases b (a &#8220;graph&#8221;)</li><li>there is some set of weights and biases where the loss L is at a minimum for a given input</li><li>we can use calculus to figure out how to adjust the weights and biases to minimize loss</li></ul>
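<p>For the curious, here is a small sketch of the chain rule on the simplest possible case: a single hypothetical neuron with one weight, one bias, and no activation function, so that y = w × x + b and L = (y − target)². This toy model and all of its names are our own assumptions for illustration; a real network chains many more steps together in the same way.</p>

```python
def forward(w, b, x):
    """Output of a one-weight, one-bias model (no activation function)."""
    return w * x + b


def loss(w, b, x, target):
    """Squared-error loss of the model output against the target."""
    return (forward(w, b, x) - target) ** 2


def gradients(w, b, x, target):
    """Chain rule: dL/dw = dL/dy * dy/dw and dL/db = dL/dy * dy/db."""
    y = forward(w, b, x)
    dL_dy = 2 * (y - target)  # how L changes when y changes
    dy_dw = x                 # how y changes when w changes
    dy_db = 1.0               # how y changes when b changes
    return dL_dy * dy_dw, dL_dy * dy_db


dL_dw, dL_db = gradients(w=2.0, b=1.0, x=3.0, target=10.0)
print(dL_dw, dL_db)
```

<p>Both results are negative here, which says that increasing either w or b would reduce the loss for this sample.</p>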



<h2 class="wp-block-heading">The Loss Surface and Gradient Descent</h2>



<p>Let&#8217;s consider a very simple case where there are just two weights, w1 and w2, and no biases. The graph of L as a function of w1 and w2 might look like Figure 4.</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="397" height="254" src="https://blog.finxter.com/wp-content/uploads/2022/08/image-27.png" alt="" class="wp-image-568092" srcset="https://blog.finxter.com/wp-content/uploads/2022/08/image-27.png 397w, https://blog.finxter.com/wp-content/uploads/2022/08/image-27-300x192.png 300w" sizes="auto, (max-width: 397px) 100vw, 397px" /><figcaption><strong>Figure 4</strong>: bowl-shaped error graph</figcaption></figure>
</div>


<p></p>



<p>In this example, with two independent weights, we end up with a bowl-shaped surface for the loss graph. In this case, the loss is minimized when w1 = 4 and w2 = 3. In the beginning, when the network is not yet trained, the weights (initially set to small <a rel="noreferrer noopener" href="https://blog.finxter.com/create-a-list-of-random-numbers-the-most-pythonic-way/" data-type="post" data-id="10516" target="_blank">random</a> numbers) are almost certainly not at the values that minimize the loss.</p>



<p>We still figure out which direction to change the weights to reduce the loss by taking the derivative. </p>



<p>Only this time, since there are two independent variables, we take the derivative with respect to each independently. </p>



<p class="has-global-color-8-background-color has-background"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Important</strong>: The result is, for any given point on the loss surface, a vector (an arrow) pointing in the direction in which the loss increases the fastest (&#8220;uphill&#8221;). This is known as the <strong>gradient</strong> (instead of derivative). Since we want to reduce loss, we move in the opposite direction, the <em>negative</em> of the gradient.</p>



<p>The larger point is we are still using calculus to figure out which direction to change weights to reduce loss. Repeatedly doing this moves the weights closer to the values which make the network give the correct answer for a given input. This is known as <strong>gradient descent</strong>.</p>
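<p>Here is a minimal sketch of gradient descent on a bowl-shaped loss. We do not know the exact equation behind Figure 4, so we assume a simple stand-in bowl, L = (w1 − 4)² + (w2 − 3)², whose minimum sits at w1 = 4 and w2 = 3 as in the text. The starting weights and the step size of 0.1 are arbitrary illustrative choices.</p>

```python
def loss(w1, w2):
    """Assumed bowl-shaped loss with its minimum at w1 = 4, w2 = 3."""
    return (w1 - 4) ** 2 + (w2 - 3) ** 2


def gradient(w1, w2):
    """Derivative of the loss with respect to each weight independently."""
    return 2 * (w1 - 4), 2 * (w2 - 3)


def gradient_descent(w1, w2, step_size=0.1, steps=100):
    """Repeatedly take a small step in the negative gradient direction."""
    for _ in range(steps):
        g1, g2 = gradient(w1, w2)
        w1 -= step_size * g1
        w2 -= step_size * g2
    return w1, w2


# Start from small arbitrary weights and walk downhill
w1, w2 = gradient_descent(0.5, -0.5)
print(round(w1, 4), round(w2, 4), loss(w1, w2))
```

<p>Each step moves the weights a little way downhill, so after enough steps they settle very close to the bottom of the bowl.</p>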



<p class="has-base-background-color has-background"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f449.png" alt="👉" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Recommended Tutorial</strong>: <a href="https://blog.finxter.com/gradient-descent-in-neural-nets-a-simple-guide-to-ann-learning/" data-type="post" data-id="673142" target="_blank" rel="noreferrer noopener">Gradient Descent in Neural Nets &#8211; A Simple Guide to ANN Learning</a></p>



<p>However, most neural networks have many more than two weights, often thousands or more for any given layer. </p>



<p>But the same ideas still apply: if we have a layer consisting of 16 weighted connections, the loss is a 16-dimensional surface! You can&#8217;t visualize it but it still exists mathematically, and the same principles apply! </p>



<p>You can still calculate the gradient, that is, the derivative with respect to each of the 16 w&#8217;s, and figure out which direction to change the w&#8217;s to minimize the loss.</p>



<p>So how much do we adjust the weights and biases? </p>



<p>Typically they are adjusted just a small amount. This is because large adjustments can cause problems. </p>



<p>Refer to the loss surface shown in Figure 4. If too large a step is made, you could jump right across the loss surface bowl, even going so far as to make the loss worse! </p>



<p class="has-global-color-8-background-color has-background"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" /> The adjustment step size is known as the <strong>learning rate</strong>. Figuring out the best learning rate is one of the tricks to optimizing your network that a neural network engineer has to work out.</p>
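<p>The effect of step size can be sketched with the earlier one-variable crowd-size loss, assumed here to be L = (y − 500)². The two learning rates below, 0.1 and 1.1, are hypothetical values chosen only to show the contrast between a step that is small enough and one that is too large.</p>

```python
def descend(y, learning_rate, steps=20, target=500):
    """Run gradient descent on L = (y - target)**2 starting from y."""
    for _ in range(steps):
        y -= learning_rate * 2 * (y - target)
    return y


start = 350
small = descend(start, learning_rate=0.1)  # cautious steps
large = descend(start, learning_rate=1.1)  # steps that overshoot the bowl

print("distance from 500 with small steps:", abs(small - 500))
print("distance from 500 with large steps:", abs(large - 500))
```

<p>With the small learning rate the estimate creeps toward 500. With the large one, every step jumps across the bowl and lands farther up the other side, so the loss keeps getting worse.</p>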



<h2 class="wp-block-heading">Backpropagation</h2>



<p>Ultimately, <em>all</em> of the weights and biases throughout the network have to be adjusted to minimize loss. This is done starting from the loss and working back layer by layer to the beginning of the network, a process called <strong>backpropagation</strong>. </p>



<p>It has to be done this way because you can&#8217;t figure out how the first layer&#8217;s weights and biases affect loss until you know how the second layer&#8217;s weights and biases affect loss; you can&#8217;t tell how the second layer&#8217;s weights and biases affect loss until you know how the third layer&#8217;s weights and biases affect loss, and so on. </p>



<p>So calculations and adjustments are done starting with the last layer, then working back to the second to the last layer, and so on back to the first layer.</p>



<p>So that&#8217;s the core algorithm of training a neural network:</p>



<ol class="wp-block-list"><li>Present example image.</li><li>Calculate the loss.</li><li>Adjust the network weights and biases through backpropagation: calculate the gradients and make gradient descent adjustments layer by layer.</li></ol>
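<p>The backward pass can be sketched by hand for a tiny hypothetical network with two layers of one neuron each. We assume a sigmoid activation in the first layer, no activation in the second, and a squared-error loss. All names and parameter values are illustrative, not part of any library.</p>

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(params, x):
    """Two layers: h = sigmoid(w1*x + b1), then y = w2*h + b2."""
    w1, b1, w2, b2 = params
    h = sigmoid(w1 * x + b1)
    y = w2 * h + b2
    return h, y


def loss(params, x, target):
    _, y = forward(params, x)
    return (y - target) ** 2


def backprop(params, x, target):
    """Work back from the loss: last layer first, then the first layer."""
    w1, b1, w2, b2 = params
    h, y = forward(params, x)
    dL_dy = 2 * (y - target)
    # Last layer: how its weight and bias affect the loss
    dL_dw2 = dL_dy * h
    dL_db2 = dL_dy
    # Pass the error signal back through the last layer to the first
    dL_dh = dL_dy * w2
    dL_dz = dL_dh * h * (1 - h)  # derivative of sigmoid is h * (1 - h)
    dL_dw1 = dL_dz * x
    dL_db1 = dL_dz
    return dL_dw1, dL_db1, dL_dw2, dL_db2


params = (0.5, -0.2, 1.5, 0.1)
grads = backprop(params, x=2.0, target=1.0)
print(grads)
```

<p>Notice that the first layer&#8217;s gradients reuse <code>dL_dy</code> and <code>w2</code> from the last layer, which is why the calculation has to run from the back of the network toward the front.</p>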



<h2 class="wp-block-heading">Batch Size</h2>



<p>However, recall that the objective of the training is to adjust the weights and biases for <em>all</em> of the images, not just one. </p>



<p>So how does one train the network, one image at a time, or using the entire set of all training images? Either choice is a possibility. </p>



<p class="has-global-color-8-background-color has-background"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" /> Ultimately the loss we want to minimize is the loss for the entire set of training samples, so a natural choice might be to run all samples through the network before making adjustments to the weights and biases. This is known as <strong>batch processing</strong>. </p>



<p>However, performing so many calculations before making adjustments can be very demanding on computer resources and can slow the training process down.</p>



<p class="has-global-color-8-background-color has-background"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" /> How about adjusting weights and biases for each individual training sample? Optimum weights and biases will be different for each training sample, and this variation can introduce large randomness into the gradient descent. This is known as <strong>stochastic gradient descent</strong>.</p>



<p>To better understand the importance of this, refer to the hypothetical loss curve in Figure 5:</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="471" height="282" src="https://blog.finxter.com/wp-content/uploads/2022/08/image-28.png" alt="" class="wp-image-568110" srcset="https://blog.finxter.com/wp-content/uploads/2022/08/image-28.png 471w, https://blog.finxter.com/wp-content/uploads/2022/08/image-28-300x180.png 300w" sizes="auto, (max-width: 471px) 100vw, 471px" /><figcaption><strong>Figure 5</strong>: local and global minimum</figcaption></figure>
</div>


<p></p>



<p>Notice that there is more than one minimum: there is a <strong>local minimum</strong> at point B, which is not quite the lowest loss, and a <strong>global minimum</strong> at point A that is truly the minimum where the loss is lowest. </p>



<p>It is truly possible (even likely) to get loss curves like this, with multiple local minima, and it&#8217;s also possible for the network to get stuck in one of these local minima. </p>



<p>The randomness of single sample training can help knock the network out of a local minimum if it gets stuck in one, so there is some benefit to stochastic gradient descent. </p>



<p>However, the randomness can be so extreme that it can actually knock the network out of the true global minimum if it happens to reach it before a training cycle ends. This can slow the training as the network has to work back down to minimize the loss again.</p>



<p>So in practice, it turns out the best approach is to use <strong>minibatches</strong>. These are batches of perhaps a few hundred samples that are run through the network, and <em>then</em> adjustments are made. </p>



<p>The network runs through minibatch after minibatch until the entire set of training samples has been processed. This has enough randomness that it shares the benefit of stochastic gradient descent, pushing the network out of local minima, but not so much randomness that it is likely to knock the network away from a good minimum.</p>



<p>Running through the entire set of training samples once is called an <strong>epoch</strong>. </p>



<p>Typically, networks must run through many epochs to become fully trained. Also, the ordering and grouping of training samples within and between batches is randomized from epoch to epoch. This helps to avoid <strong>overfitting</strong>. </p>



<p class="has-global-color-8-background-color has-background"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" /> Overfitting is when the network performs successfully on the training samples, but fails on samples it has not seen before. This is like a person memorizing a set of samples, rather than generalizing characteristics from those samples so that they can be successful on new samples.</p>
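<p>The ideas of minibatches, epochs, and reshuffling can be sketched in a few lines. The helper <code>minibatches</code> below is a hypothetical function of our own, and the ten integers stand in for real training samples. The batch size of 4 is deliberately tiny so the output is easy to read.</p>

```python
import random


def minibatches(samples, batch_size, rng):
    """Shuffle the samples, then yield them in groups of batch_size."""
    order = list(samples)
    rng.shuffle(order)
    for start in range(0, len(order), batch_size):
        yield order[start:start + batch_size]


rng = random.Random(0)          # seeded so runs are repeatable
training_set = list(range(10))  # stand-ins for real training samples

# Each epoch visits every sample once, in a freshly shuffled order
for epoch in range(2):
    for batch in minibatches(training_set, batch_size=4, rng=rng):
        print("epoch", epoch, "batch", batch)
```

<p>In a real training run, the weights and biases would be adjusted once after each batch. The last batch of an epoch may be smaller than the others when the batch size does not divide the training set evenly.</p>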



<p>After training, the network is then tested on a <strong>test set</strong>. This is a set of samples the network has not seen before. This allows one to assess how well the trained network performs. It checks to see how effective the network is on unknown samples, and checks to make sure overfitting has not occurred.</p>



<h2 class="wp-block-heading">How Neural Networks Learn</h2>



<p>So that is the full process of how neural networks learn:</p>



<ol class="wp-block-list"><li>Train the network by presenting it with <strong>minibatches</strong> of samples from the <strong>training set</strong>.</li><li>The training algorithm calculates the <strong>loss</strong> for the minibatch.</li><li>The algorithm calculates the <strong>gradient</strong> of the loss.</li><li>The network adjusts <strong>weights</strong> and <strong>biases</strong> according to the gradient calculations, through the process of <strong>backpropagation</strong> and <strong>gradient descent</strong>.</li><li>Running this sequence through all training samples is called an <strong>epoch</strong>.</li><li>This is then repeated for multiple epochs, until the network is successfully trained on the training set.</li><li>Finally, the network is tested on a <strong>test set</strong> to make sure it works successfully and does not suffer from <strong>overfitting</strong>.</li></ol>
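<p>The whole sequence can be sketched end to end in plain Python. This is a toy stand-in under our own assumptions: the &#8220;network&#8221; is a single neuron with one weight and one bias and no activation function, the data follow the made-up rule t = 2x + 1, and the learning rate, batch size, and epoch count are arbitrary illustrative values. A real network would have many layers and would rely on a library such as PyTorch to compute the gradients.</p>

```python
import random


def predict(w, b, x):
    return w * x + b


def mean_loss(w, b, batch):
    """Average squared-error loss over a batch of (input, label) pairs."""
    return sum((predict(w, b, x) - t) ** 2 for x, t in batch) / len(batch)


def mean_gradient(w, b, batch):
    """Average gradient of the loss with respect to w and b over a batch."""
    grad_w = sum(2 * (predict(w, b, x) - t) * x for x, t in batch) / len(batch)
    grad_b = sum(2 * (predict(w, b, x) - t) for x, t in batch) / len(batch)
    return grad_w, grad_b


def train(training_set, epochs=200, batch_size=8, learning_rate=0.2, seed=0):
    rng = random.Random(seed)
    w, b = rng.uniform(-0.1, 0.1), 0.0  # start from a small random weight
    samples = list(training_set)
    for _ in range(epochs):
        rng.shuffle(samples)  # new ordering and grouping every epoch
        for start in range(0, len(samples), batch_size):
            batch = samples[start:start + batch_size]
            grad_w, grad_b = mean_gradient(w, b, batch)
            w -= learning_rate * grad_w  # gradient descent step
            b -= learning_rate * grad_b
    return w, b


# Made-up data following t = 2x + 1, split into training and test sets
data = [(i / 49, 2 * (i / 49) + 1) for i in range(50)]
random.Random(1).shuffle(data)
training_set, test_set = data[:40], data[40:]

w, b = train(training_set)
print("learned w and b:", round(w, 3), round(b, 3))
print("training loss:", mean_loss(w, b, training_set))
print("test loss:", mean_loss(w, b, test_set))
```

<p>A low loss on the test set, which the model never saw during training, is the sign that it has generalized rather than memorized.</p>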



<p>We hope you have found this lesson on how neural networks learn informative.</p>



<p>We wish you happy coding!</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>The Magic of Neural Networks: History and Concepts</title>
		<link>https://blog.finxter.com/the-magic-of-neural-networks-how-they-work/</link>
		
		<dc:creator><![CDATA[Aaron Glatzer]]></dc:creator>
		<pubDate>Wed, 27 Jul 2022 12:22:07 +0000</pubDate>
				<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[Computer Science]]></category>
		<category><![CDATA[Data Science]]></category>
		<category><![CDATA[Deep Learning]]></category>
		<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[PyTorch]]></category>
		<guid isPermaLink="false">https://blog.finxter.com/?p=515670</guid>

					<description><![CDATA[<p>Artificial neural networks have become a powerful tool providing many benefits in our modern world. They are used to filter out spam, perform voice recognition, play games, and drive cars, among many other things. As remarkable as these tools are, they are readily within the grasp of almost anyone. If you have technical interest and ... <a title="The Magic of Neural Networks: History and Concepts" class="read-more" href="https://blog.finxter.com/the-magic-of-neural-networks-how-they-work/" aria-label="Read more about The Magic of Neural Networks: History and Concepts">Read more</a></p>
<p>The post <a href="https://blog.finxter.com/the-magic-of-neural-networks-how-they-work/">The Magic of Neural Networks: History and Concepts</a> appeared first on <a href="https://blog.finxter.com">Be on the Right Side of Change</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p>Artificial neural networks have become a powerful tool providing many benefits in our modern world. They are used to filter out spam, perform voice recognition, <a href="https://blog.finxter.com/alphazero-plays-connect-four/" data-type="post" data-id="420308" target="_blank" rel="noreferrer noopener">play games</a>, and drive cars, among many other things.</p>



<figure class="wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio"><div class="wp-block-embed__wrapper">
<iframe loading="lazy" title="The Magic of Neural Networks: History and Concepts" width="937" height="527" src="https://www.youtube.com/embed/munG19WO8JQ?feature=oembed" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe>
</div></figure>



<p>As remarkable as these tools are, they are readily within the grasp of almost anyone. If you have technical interest and have some experience with computer programming, you can build your own neural networks. Knowledge of some basic algebra and some programming experience is all you need to get started.</p>



<p>And don&#8217;t be afraid to read through this article. Don&#8217;t worry if you don&#8217;t know the algebra &#8211; we have tried to make the text understandable by anyone.</p>



<p class="has-global-color-8-background-color has-background"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f9e0.png" alt="🧠" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>What You&#8217;ll Learn</strong>: In this article, we will go over the fundamentals of how neural networks are built and how they work. When done, you won&#8217;t yet know how to build them yourself, but you&#8217;ll understand the fundamentals of how they work, which will help you when you get to building your own. </p>



<p>But first we will briefly review a little about real neurons and how this has inspired the development of artificial neural networks.</p>



<p>You can find part 2 of this series in the following tutorial on the Finxter blog:</p>



<p class="has-base-background-color has-background"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f30d.png" alt="🌍" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Part 2</strong>: <a rel="noreferrer noopener" href="https://blog.finxter.com/how-neural-networks-learn/" data-type="URL" data-id="https://blog.finxter.com/how-neural-networks-learn/" target="_blank">How Do Neural Networks Learn?</a></p>



<h2 class="wp-block-heading" id="A-Little-History-and-Inspiration">A Little History and Inspiration</h2>



<p>Throughout the history of artificial neural networks, their development has been influenced by research and understanding of how real neurons operate. Let&#8217;s briefly review a simplified understanding of real neurons to provide some inspiration for how artificial neurons might be designed.</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="633" height="348" src="https://blog.finxter.com/wp-content/uploads/2022/07/image-95.png" alt="" class="wp-image-515682" srcset="https://blog.finxter.com/wp-content/uploads/2022/07/image-95.png 633w, https://blog.finxter.com/wp-content/uploads/2022/07/image-95-300x165.png 300w" sizes="auto, (max-width: 633px) 100vw, 633px" /><figcaption><strong>Figure 1</strong>: A biological neuron with synapses. <a href="http://www.freepik.com">Designed by brgfx / Freepik</a>, labeled by author</figcaption></figure>
</div>


<p>Figure 1 shows a schematic drawing of a real neuron. </p>



<p>The neuron consists of a collection of dendrites, the soma, which is the cell body, and an axon. </p>



<p>Signals come in through the dendrites. The signals are added together within the soma. If the collection of signals is strong enough the neuron will be triggered to send a spike signal down the axon, thereby sending a signal on to other neurons. </p>



<p>Figure 2 shows real neurons connected together in a network.</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="500" height="333" src="https://blog.finxter.com/wp-content/uploads/2022/07/image-96.png" alt="" class="wp-image-515686" srcset="https://blog.finxter.com/wp-content/uploads/2022/07/image-96.png 500w, https://blog.finxter.com/wp-content/uploads/2022/07/image-96-300x200.png 300w" sizes="auto, (max-width: 500px) 100vw, 500px" /><figcaption><strong>Figure 2</strong>: Neuroscience vector <a href="https://www.freepik.com/vectors/neuroscience">created by rawpixel.com &#8211; www.freepik.com</a></figcaption></figure>
</div>


<p>What kind of signals might these neurons convey? </p>



<p>Although each neuron sends discrete spike signals, research has shown that with more stimulation of the neuron, the spikes occur more frequently.</p>



<p>This may suggest it&#8217;s actually the frequency of spiking (an analog value) rather than the spike (a digital value) that may be the important signal that neurons convey.</p>



<p>What kind of signal might the neuron finally output? </p>



<p>One can imagine that </p>



<ul class="wp-block-list"><li>with very faint stimulation, a neuron may not output much; </li><li>with modest stimulation, a neuron will output more, perhaps in a linear fashion; and </li><li>with much more stimulation, the neuron may saturate and not be able to output any more.</li></ul>



<p>This could result in a sigmoid-shaped output, as in Figure 3.</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="449" height="266" src="https://blog.finxter.com/wp-content/uploads/2022/07/image-97.png" alt="" class="wp-image-515710" srcset="https://blog.finxter.com/wp-content/uploads/2022/07/image-97.png 449w, https://blog.finxter.com/wp-content/uploads/2022/07/image-97-300x178.png 300w" sizes="auto, (max-width: 449px) 100vw, 449px" /><figcaption><strong>Figure 3</strong>: Low, mid, and high stimulation output</figcaption></figure>
</div>
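<p>One common sigmoid-shaped function is the logistic function, 1 / (1 + e<sup>-z</sup>). We do not know which exact curve Figure 3 plots, so treat the sketch below as one plausible example of the shape. The three stimulation values are arbitrary.</p>

```python
import math


def sigmoid(z):
    """Logistic function: near 0 for very negative z, near 1 for large z."""
    return 1.0 / (1.0 + math.exp(-z))


# Faint, modest, and strong stimulation
for stimulation in (-6, 0, 6):
    print(stimulation, round(sigmoid(stimulation), 3))
```

<p>The output barely responds at the low end, changes most quickly in the middle, and levels off (saturates) at the high end.</p>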


<p><br></p>



<p>How might these neurons and networks encode and learn their information? </p>



<p>In 1949 Donald O. Hebb proposed a model for how neuron function might contribute to learning in his book &#8220;The Organization of Behavior&#8221;. He proposed that neural connections are strengthened through use and that this may be the foundation of learning within the brain. </p>



<p>This is sometimes described as &#8220;neurons that fire together wire together&#8221;, and this is known as <strong>Hebbian learning</strong>. </p>



<p class="has-global-color-8-background-color has-background"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" /> The implication here is that through learning and use, some neural connections become stronger than others and that it is the pattern of connection strengths that encodes learning and memory.</p>



<p>Understand that further research has shown real neurons to be more complicated than the simple description here. </p>



<p>However, this description does reflect some properties of real neurons, and it turns out even these relatively simple models can exhibit some remarkable behavior.</p>



<h2 class="wp-block-heading" id="Artificial-Neurons-and-Networks">Artificial Neurons and Networks</h2>



<p>Now that we&#8217;ve reviewed some simple properties of real neurons and neural networks, let&#8217;s use this simplified understanding of real neurons as inspiration for our design of artificial neurons and networks.</p>



<p>Figure 4 shows our artificial neuron. </p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="731" height="372" src="https://blog.finxter.com/wp-content/uploads/2022/07/image-98.png" alt="" class="wp-image-515715" srcset="https://blog.finxter.com/wp-content/uploads/2022/07/image-98.png 731w, https://blog.finxter.com/wp-content/uploads/2022/07/image-98-300x153.png 300w" sizes="auto, (max-width: 731px) 100vw, 731px" /><figcaption><strong>Figure 4</strong>: Artificial neuron</figcaption></figure>
</div>


<p>As with dendrites in real neurons, signals come in from other neurons through multiple inputs. </p>



<h3 class="wp-block-heading">Artificial Neural Network Weights</h3>



<p>The strengths of those connections are expressed by <strong>weights</strong> (w1, w2, etc.) shown on each input. Incoming signals are multiplied by the weights so that larger weights result in stronger signals from that connection, reflecting that stronger connection. All of those signals are added up in the node of the neuron.</p>



<h3 class="wp-block-heading">Artificial Neural Network Bias</h3>



<p>For each neuron, there is also one other signal, not connected to any other neuron, that is added in. This is called the <strong>bias</strong>. This constant signal determines if that neuron is already enhanced or suppressed on its own, in addition to signals provided by the inputs.</p>



<h3 class="wp-block-heading">Artificial Neural Network Activation Function</h3>



<p>Finally, that total input is passed through a function known as the <strong>activation function</strong>. This function determines how the neuron responds to the activation by its inputs. </p>



<p>There are multiple different functions that are used as activation functions. We have already justified choosing a sigmoid-shaped function, and sigmoid-shaped functions are a common choice. </p>



<p>Though there are multiple other activation functions considered for use, sigmoid-shaped functions are the easiest to motivate from a biological standpoint.</p>



<p>So to reiterate, here is how we describe the signal processing done by each neuron:</p>



<ol class="wp-block-list"><li>Multiply each input by its weight and add them all up.</li><li>Add the bias.</li><li>Process the total through the activation function.</li></ol>



<p>And here is how we describe it mathematically:</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="207" height="69" src="https://blog.finxter.com/wp-content/uploads/2022/07/image-99.png" alt="" class="wp-image-515726"/></figure>
</div>


<p class="has-text-align-center"><em>(add up all the weighted inputs, plus bias)</em></p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="152" height="56" src="https://blog.finxter.com/wp-content/uploads/2022/07/image-100.png" alt="" class="wp-image-515730" srcset="https://blog.finxter.com/wp-content/uploads/2022/07/image-100.png 152w, https://blog.finxter.com/wp-content/uploads/2022/07/image-100-150x56.png 150w" sizes="auto, (max-width: 152px) 100vw, 152px" /></figure>
</div>


<p class="has-text-align-center"><em>(process through the activation function)</em></p>



<p>where:</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="368" height="126" src="https://blog.finxter.com/wp-content/uploads/2022/07/image-101.png" alt="" class="wp-image-515735" srcset="https://blog.finxter.com/wp-content/uploads/2022/07/image-101.png 368w, https://blog.finxter.com/wp-content/uploads/2022/07/image-101-300x103.png 300w" sizes="auto, (max-width: 368px) 100vw, 368px" /></figure>
</div>


<p>There we have it &#8211; this is our artificial neuron. It&#8217;s really quite a simple object: add together weighted inputs, add the bias, and pass that through an activation function for the final output.</p>
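<p>The three steps above can be sketched in a few lines of Python. This is a minimal illustration, assuming the logistic function as the sigmoid-shaped activation; the function names and the example values are hypothetical and chosen only to make the arithmetic easy to follow.</p>

```python
import math


def sigmoid(z):
    """Smooth, S-shaped activation that squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(inputs, weights, bias, activation=sigmoid):
    """Weighted sum of the inputs, plus the bias, passed through the activation."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(total)


# Hypothetical example values, chosen only for illustration.
inputs = [1.0, 0.5]
weights = [2.0, -4.0]
bias = 0.0

# The weighted sum is 2.0*1.0 + (-4.0)*0.5 + 0.0 = 0.0, and sigmoid(0.0) is 0.5.
print(neuron_output(inputs, weights, bias))
```

<p>Passing a different function as <code>activation</code> (for example a step function) changes how the neuron responds without changing the rest of the calculation.</p>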



<p>This simple object was first introduced by Warren McCulloch and Walter Pitts in their 1943 (!) paper <em>&#8220;A logical calculus of the ideas immanent in nervous activity&#8221;</em>, except their activation function was a step function instead of the smooth sigmoid-shaped function we discussed before. </p>



<p class="has-base-background-color has-background"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" /> Researcher Frank Rosenblatt called this object the <strong>perceptron</strong> in his 1962 book &#8220;Principles of Neurodynamics&#8221;.</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="1000" height="450" src="https://blog.finxter.com/wp-content/uploads/2022/07/image-102.png" alt="" class="wp-image-515741" srcset="https://blog.finxter.com/wp-content/uploads/2022/07/image-102.png 1000w, https://blog.finxter.com/wp-content/uploads/2022/07/image-102-300x135.png 300w, https://blog.finxter.com/wp-content/uploads/2022/07/image-102-768x346.png 768w" sizes="auto, (max-width: 1000px) 100vw, 1000px" /><figcaption><strong>Figure 5</strong>: Multiple layers of neurons (deep neural network) classifying an image. Dog photo by <a rel="noreferrer noopener" href="https://www.pexels.com/photo/shallow-focus-photography-of-a-golden-retriever-686094/" data-type="URL" data-id="https://www.pexels.com/photo/shallow-focus-photography-of-a-golden-retriever-686094/" target="_blank">Garfield Besa</a></figcaption></figure>
</div>


<p>Figure 5 shows a network of these simple elements integrated together in a multi-layer artificial neural network. This multi-layer network is sometimes called a multi-layer perceptron (abbreviated <strong>MLP</strong>). </p>



<p>Signals come in through the input side, say for example a picture of a cat or a dog. The signals then pass through the network, getting processed by neural calculations along the way. Then on the output side the network provides an output indicating whether the image was a cat or a dog.</p>



<h2 class="wp-block-heading" id="How-Neural-Networks-Are-Programmed">How Neural Networks Are Programmed</h2>



<p>Historically, much of the field of <a href="https://blog.finxter.com/artificial-intelligence-machine-learning-deep-learning-and-data-science-whats-the-difference/" data-type="post" data-id="4908" target="_blank" rel="noreferrer noopener">artificial intelligence</a> relied on more traditional computer programming methods, using <a href="https://blog.finxter.com/21-most-profitable-programming-languages-in-2023/" data-type="post" data-id="404278" target="_blank" rel="noreferrer noopener">programming languages</a> to model intelligence. </p>



<p>These efforts did achieve some compelling results, such as computers that could play competitive checkers or chess. </p>



<p>However, very simple things that even a young child can do eluded computers, things like being able to recognize what object is in a picture. In fact, this seemingly simple task is actually quite difficult to program. Let&#8217;s briefly explore this.</p>



<p>Suppose we want to program the computer to be able to recognize handwritten numerical digits. </p>



<p class="has-global-color-8-background-color has-background"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" /> In fact, this is a common exercise for beginning neural network students; building a network to solve this problem is considered the neural network version of the <em>&#8220;Hello world!&#8221;</em> program, and there&#8217;s a database called <strong>the MNIST handwritten digit database</strong> built for exactly this problem. </p>



<p>Figure 6 shows a sample of some of the digits from this database.</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="425" height="251" src="https://blog.finxter.com/wp-content/uploads/2022/07/image-103.png" alt="" class="wp-image-515752" srcset="https://blog.finxter.com/wp-content/uploads/2022/07/image-103.png 425w, https://blog.finxter.com/wp-content/uploads/2022/07/image-103-300x177.png 300w" sizes="auto, (max-width: 425px) 100vw, 425px" /><figcaption><strong>Figure 6</strong>: Hand-writing recognition data set.</figcaption></figure>
</div>


<p>Let&#8217;s think about how one might program a computer to do this. Just looking at the numbers you recognize them instantly, but how might you write a program to do that? </p>



<p>Look at the various numbers &#8220;seven&#8221; for example. </p>



<p>Perhaps one could describe it as one horizontal line at the top, and one slanted vertical line below. Would you specify the coordinates where the lines should be? Probably not &#8211; what if the number was written off to the side or in a corner of the image? Could there also be a limit to how long or how short these lines are? </p>



<p>As you can see, the number of rules for identifying a number &#8220;seven&#8221; grows quickly and gets complicated.</p>



<p>But what about a more sophisticated problem? What about identifying whether a picture is of a cat or a dog?</p>



<p>Not only is there an enormous variety of cats and dogs to distinguish, just figuring out how to describe them so a computer could recognize them is a daunting challenge. Where would one even start?</p>



<p class="has-base-background-color has-background"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" /> Instead of writing code to solve this, neural networks are &#8220;programmed&#8221; in a fashion more like how humans learn &#8211; the network is trained with a set of examples, and the network learns from these examples. </p>



<p>More specifically, the network&#8217;s learning is encoded in the weights and biases of the network, and <a href="https://blog.finxter.com/computer-science-research-scientist/" data-type="post" data-id="346253" target="_blank" rel="noreferrer noopener">computer scientists</a> have figured out algorithms that allow the network to automatically self-adjust its weights and biases. </p>



<p>The process is called <strong>backpropagation</strong>. The network adjusts its weights and biases to get closer to the correct answer, working backward from the output to the input. </p>



<p>Therefore the programmer does not have to figure out how to encode the solution; the network itself figures it out. </p>
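<p>The idea of nudging weights and biases toward the correct answer can be sketched for the simplest possible case: a single neuron learning the logical OR of two inputs. This is only an illustration under stated assumptions, not a full multi-layer backpropagation implementation. The logistic activation, the squared-error measure, the learning rate, the number of passes, and all function names are choices made here for the example.</p>

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(inputs, weights, bias):
    """Output of a single neuron: activation(weighted sum + bias)."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)


def train(examples, epochs=5000, learning_rate=0.5):
    """Repeatedly adjust weights and bias to shrink the squared error."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for inputs, target in examples:
            output = predict(inputs, weights, bias)
            # Chain rule for the error 0.5 * (output - target)**2:
            # d(error)/d(total) = (output - target) * output * (1 - output)
            delta = (output - target) * output * (1.0 - output)
            # Move each weight (and the bias) a small step against its gradient.
            weights = [w - learning_rate * delta * x for w, x in zip(weights, inputs)]
            bias -= learning_rate * delta
    return weights, bias


# Training examples for logical OR: (inputs, correct answer).
examples = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]

weights, bias = train(examples)
for inputs, target in examples:
    print(inputs, target, round(predict(inputs, weights, bias), 2))
```

<p>Nobody wrote a rule for OR here; the weights and bias were found from the examples alone. In a multi-layer network the same chain-rule step is applied layer by layer, from the output back toward the input.</p>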



<p>Once the network is trained on a set of examples, new cases can be introduced to the network, and the network provides correct answers.</p>



<p><em>(Well, ideally, that is. Part of the skill of being a <a href="https://blog.finxter.com/deep-learning-engineer-income-and-opportunity/" data-type="post" data-id="307422" target="_blank" rel="noreferrer noopener">neural network engineer</a> is figuring out how best to build and train networks to get the desired performance.)</em></p>



<p>So what kind of programming does a neural network engineer do? </p>



<ul class="wp-block-list"><li>They write code that describes the structure of the network, such as how many layers it has and how many neurons are in each layer. </li><li>They decide which activation function to use. </li><li>They also write code that specifies how to measure the errors the network makes, known as <strong>loss</strong>. </li><li>They also make choices about what training data to use and how to adjust network learning. </li></ul>
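<p>To make these choices concrete, here is a small pure-Python sketch in which the structure (a list of layer sizes), the activation function, and the loss are each an explicit, swappable piece. The names <code>build_network</code>, <code>forward</code>, and <code>mse_loss</code> are hypothetical, as are the layer sizes and input values; real projects would normally use a library such as PyTorch instead.</p>

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def build_network(layer_sizes, seed=0):
    """Create random weights and zero biases for each layer after the input layer."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def forward(layers, inputs, activation=sigmoid):
    """Pass the inputs through every layer in turn and return the final outputs."""
    signal = inputs
    for weights, biases in layers:
        signal = [
            activation(sum(w * x for w, x in zip(row, signal)) + b)
            for row, b in zip(weights, biases)
        ]
    return signal


def mse_loss(outputs, targets):
    """Mean squared error: one common way to measure how wrong the outputs are."""
    return sum((o - t) ** 2 for o, t in zip(outputs, targets)) / len(targets)


# Structure: 4 inputs, one hidden layer of 3 neurons, 2 outputs (say "cat" and "dog").
network = build_network([4, 3, 2])
outputs = forward(network, [0.2, 0.9, 0.1, 0.5])
print(outputs, mse_loss(outputs, [1.0, 0.0]))
```

<p>The network here is untrained, so its outputs are arbitrary; the loss value is what a training procedure would then work to reduce.</p>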



<p>So even though neural networks learn through example, the neural network engineer has much to do to make that happen.</p>



<h2 class="wp-block-heading" id="The-&quot;Magic&quot;-of-Neural-Networks">The &#8220;Magic&#8221; of Neural Networks</h2>



<p>It is astounding that a network of very simple objects could achieve the seeming human-like ability to learn by example and recognize pictures of objects. </p>



<p>It&#8217;s not obvious up front that a collection of such simple objects could achieve this, and there are so many other things they can do: they can locate objects within an image, detect words within a conversation, and help steer cars, among other things. </p>



<p>It is amazing such simple objects could achieve such sophisticated performance. It really truly does seem almost magical.</p>



<p>We hope you have found this article helpful in gaining a basic understanding of how neural networks work. </p>



<p>Even more, we hope this article has fired your imagination and inspired you to learn more about neural networks, even to the point of building them yourself! Go out there and learn how to build some networks!</p>



<p>We wish you happy coding!</p>
<p>The post <a href="https://blog.finxter.com/the-magic-of-neural-networks-how-they-work/">The Magic of Neural Networks: History and Concepts</a> appeared first on <a href="https://blog.finxter.com">Be on the Right Side of Change</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Top 4 Jupyter Notebook Alternatives for Machine Learning</title>
		<link>https://blog.finxter.com/survey-of-python-online-notebook-options/</link>
		
		<dc:creator><![CDATA[Aaron Glatzer]]></dc:creator>
		<pubDate>Mon, 25 Apr 2022 12:58:16 +0000</pubDate>
				<category><![CDATA[Data Science]]></category>
		<category><![CDATA[Jupyter]]></category>
		<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[Python]]></category>
		<guid isPermaLink="false">https://blog.finxter.com/?p=324457</guid>

					<description><![CDATA[<p>In this article, we review some of the online options for running Python using online (Jupyter) Notebooks. The Python Landscape There are a number of platforms available for running Python. Some of these include: Install Python on your own machine. Use Jupyter notebooks on your own machine. Use a data science platform like Anaconda on ... <a title="Top 4 Jupyter Notebook Alternatives for Machine Learning" class="read-more" href="https://blog.finxter.com/survey-of-python-online-notebook-options/" aria-label="Read more about Top 4 Jupyter Notebook Alternatives for Machine Learning">Read more</a></p>
<p>The post <a href="https://blog.finxter.com/survey-of-python-online-notebook-options/">Top 4 Jupyter Notebook Alternatives for Machine Learning</a> appeared first on <a href="https://blog.finxter.com">Be on the Right Side of Change</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<figure class="wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio"><div class="wp-block-embed__wrapper">
<iframe loading="lazy" title="Top 4 Jupyter Notebook Alternatives for Machine Learning" width="937" height="527" src="https://www.youtube.com/embed/1bzWkWc-Zko?feature=oembed" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe>
</div></figure>



<p>In this article, we review some of the online options for running Python using online (Jupyter) Notebooks.</p>



<h2 class="wp-block-heading">The Python Landscape</h2>



<p>There are a number of platforms available for running Python. Some of these include:</p>



<ol class="wp-block-list"><li>Install Python on your own machine.</li><li>Use Jupyter notebooks on your own machine.</li><li>Use a data science platform like Anaconda on your own machine to set up the above.</li><li>Use one of the numerous online Python shells or interpreters.</li><li><strong>Use one of the numerous Jupyter-Notebook-like online services.</strong></li></ol>



<p>It’s this last option we will review in this article. This is a popular choice in the data science and machine learning fields.</p>



<h2 class="wp-block-heading">Quick Overview of Online Options</h2>



<p>Installing Python on your own machine is perhaps the best approach when writing software. But if you want access to Python online for use anywhere there are a number of available options.</p>



<p>There are a number of sites where you can use an online Python shell, such as <a href="http://www.python.org/shell" target="_blank" rel="noreferrer noopener">www.python.org/shell</a>, for example. </p>



<p>There are also script-based implementations of Python online, such as <a href="https://www.online-python.com/" target="_blank" rel="noreferrer noopener">https://www.online-python.com/</a>. </p>



<p>But these free offerings are often limited in how much code you can run and how many resources you can use. They are great for learning Python but can be too limited to use for more ambitious needs.</p>



<p>If you want to run more demanding processes online in the data science or machine learning fields, an online Jupyter Notebook service is an effective alternative. </p>



<p>Before we review some of those, let&#8217;s review the classic Jupyter Notebook.</p>



<h2 class="wp-block-heading">A Quick Review of Jupyter Notebooks</h2>



<p>When installing and using Python on your own machine, you either issue commands in the shell, which are executed immediately, or, more commonly, write commands in a program file and then call the interpreter to execute that file as a script.</p>



<p>Jupyter Notebooks implement a sort of hybrid version of these two approaches. Jupyter Notebooks are active documents that help an analyst both analyze data and communicate that analysis effectively.</p>



<p>Here are their features and what they do:</p>



<ol class="wp-block-list"><li>Jupyter Notebooks are displayed in a web browser, an interface widely familiar and accessible to all.</li><li>They resemble math and science textbooks, where equations and graphs are mixed within explanatory text which describes the subject matter in question.</li><li>Most significantly, the “equation” portions of Jupyter Notebooks consist of code that can be executed, so that the reader can actually run the code to duplicate the analysis. When the code is run, the results (numbers or graphs) are displayed below the code.</li><li>In this way they resemble lab notebooks, except that the descriptive text is mixed with executable code where the data analysis and experimenting are done.</li></ol>



<p>Jupyter Notebooks are created and edited within a web browser. </p>



<p>When creating a notebook the creator enters content in fields called <strong><em>“cells”</em></strong>. These are simply fields that allow the two kinds of entry, either markdown text or code. </p>



<p>The code cells can be run by hand one at a time, potentially out of order if desired (sort of like the Python shell); or the entire document can be run, cells in order, in a typical script-like manner.</p>
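<p>Why running cells out of order matters can be sketched with plain Python. In this illustration each “cell” is modelled as a string of code, and a shared dictionary stands in for the state the notebook keeps between cells. This is a simplified model for explanation only, not how Jupyter is actually implemented, and the cell contents and the <code>run_cells</code> helper are hypothetical.</p>

```python
# Three hypothetical code cells, in the order they appear in the document.
cells = [
    "data = [3, 1, 2]",
    "data = sorted(data)",
    "result = data[0]",
]


def run_cells(cells, order):
    """Execute the chosen cells in the given order against one shared namespace."""
    namespace = {}
    for index in order:
        exec(cells[index], namespace)
    return namespace


# Script-like run: every cell, top to bottom.
in_order = run_cells(cells, [0, 1, 2])
print(in_order["result"])      # 1: the list was sorted before the last cell ran

# Shell-like run: the last cell is executed before the sorting cell.
out_of_order = run_cells(cells, [0, 2, 1])
print(out_of_order["result"])  # 3: the last cell saw the unsorted list
```

<p>The same cells give different results depending on the order in which they were run, which is worth keeping in mind when sharing a notebook with others.</p>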



<p>The online services we will review implement the same kind of Jupyter Notebook interface, but provide the service online.</p>



<div class="wp-block-image"><figure class="aligncenter"><img decoding="async" src="https://lh6.googleusercontent.com/GbAeyv8X-MccYH5b_N1CvAmFBYE-DSGyVAtjkBByxOTN4GbDusM8rKr14yyywpqAUO0Jnd3zaLC0CVEXydUelmET3ytbcODg-BHGWuZCvYgkoGZNSn_68fwFK1r2O8rdjmyCwkxv" alt=""/></figure></div>



<p>Classic Jupyter Notebook on a home PC (i.e. not online), with one markdown cell, one code cell with results below it, and one empty cell below that.</p>



<h2 class="wp-block-heading">Advantages to Online Jupyter Notebooks</h2>



<p>There are a number of reasons one might choose to use an online Jupyter Notebook service:</p>



<ol class="wp-block-list"><li>You can run Python anywhere you have a computer and an online connection.</li><li>These platforms typically provide the <a rel="noreferrer noopener" href="https://blog.finxter.com/how-to-use-python-to-analyze-data/" data-type="post" data-id="171639" target="_blank">data analysis</a> and <a rel="noreferrer noopener" href="https://blog.finxter.com/machine-learning-engineer-income-and-opportunity/" data-type="post" data-id="306050" target="_blank">machine learning</a> libraries (<a rel="noreferrer noopener" href="https://blog.finxter.com/pandas-quickstart/" data-type="post" data-id="16511" target="_blank">pandas</a>, <a rel="noreferrer noopener" href="https://blog.finxter.com/numpy-tutorial/" data-type="post" data-id="1356" target="_blank">NumPy</a>, <a rel="noreferrer noopener" href="https://blog.finxter.com/how-to-install-scikit-learn-in-python/" data-type="post" data-id="35974" target="_blank">scikit-learn</a>, etc.) that you need. Most other Python libraries are typically available as well.</li><li>Typically they provide systems with high-performing GPUs so that your data processing is fast and efficient. These often implement world-class computing capabilities. This is often essential for machine learning models to be effective and efficient. It is the server that provides the computing power; your own computer just needs to be able to display the webpage.</li><li>They take care of managing the computer system, so you don’t have to. You can be sure you have the computing resources and packages you need, and that they’ll work out of the box. You can focus on using the tools, rather than working on making sure you have a system up to the task. This can be one of the most beneficial aspects: with no effort you can have access to world-class computing resources.</li></ol>
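<p>One quick way to see which of these libraries a given notebook service actually has installed is to ask Python itself from a code cell. The sketch below uses the standard library&#8217;s <code>importlib.util.find_spec</code>; the helper name <code>available</code> and the list of packages are just examples. Note that scikit-learn is imported under the name <code>sklearn</code>.</p>

```python
import importlib.util


def available(package_names):
    """Map each top-level package name to True if it can be imported here."""
    return {name: importlib.util.find_spec(name) is not None for name in package_names}


# Example package list; the result depends on the environment you run this in.
print(available(["pandas", "numpy", "sklearn"]))
```

<p>Running the same cell on each service gives a simple side-by-side comparison of what works out of the box.</p>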



<p>Now that we understand Jupyter Notebooks, and we have seen the reasons one might choose to use an online platform, let&#8217;s review some of them to see what they offer.</p>



<h2 class="wp-block-heading"><strong>Google Colab</strong></h2>



<ul class="wp-block-list"><li>Try it here: <a href="https://colab.research.google.com/" target="_blank" rel="noreferrer noopener">https://colab.research.google.com/</a></li></ul>



<div class="wp-block-image"><figure class="aligncenter size-large"><a href="https://colab.research.google.com/" target="_blank" rel="noopener"><img loading="lazy" decoding="async" width="1024" height="441" src="https://blog.finxter.com/wp-content/uploads/2022/04/image-224-1024x441.png" alt="" class="wp-image-324460" srcset="https://blog.finxter.com/wp-content/uploads/2022/04/image-224-1024x441.png 1024w, https://blog.finxter.com/wp-content/uploads/2022/04/image-224-300x129.png 300w, https://blog.finxter.com/wp-content/uploads/2022/04/image-224-768x331.png 768w, https://blog.finxter.com/wp-content/uploads/2022/04/image-224-1536x662.png 1536w, https://blog.finxter.com/wp-content/uploads/2022/04/image-224.png 1844w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></a></figure></div>



<p>Google Colaboratory, or <em>Colab</em> for short, is Google’s implementation of online Jupyter Notebooks.</p>



<h3 class="wp-block-heading"><strong>Features</strong></h3>



<ul class="wp-block-list"><li>Jupyter-like web interface.</li><li>Customizable keystrokes.</li><li>Google Colab documents are Jupyter Notebook files, so they can be downloaded and viewed in Classic Jupyter Notebook.</li><li>These files can be saved in Google Drive and GitHub. If in Google Drive, they can be shared with others there.</li><li>Data science packages like <a rel="noreferrer noopener" href="https://blog.finxter.com/pandas-cheat-sheets/" data-type="post" data-id="7977" target="_blank">pandas</a>, etc. are supported with the import command.</li><li>Machine learning packages like <a rel="noreferrer noopener" href="https://blog.finxter.com/scikit-learn-cheat-sheets/" data-type="post" data-id="20549" target="_blank">scikit-learn</a>, etc. are supported with the <code><a href="https://blog.finxter.com/python-__import__-magic-method/" data-type="post" data-id="108900">import</a></code> command.</li><li>Several tutorial notebooks available for training in data science and machine learning.</li><li>Free use of GPU and TPU.</li><li>Unable to support <code>voila</code>. (<code>voila</code> combined with <code>ipywidgets</code> hides code cells so that notebooks can look like a normal GUI application.)</li></ul>



<h3 class="wp-block-heading"><strong>Tiers</strong></h3>



<figure class="wp-block-table is-style-stripes"><table><tbody><tr><td><strong>Colab</strong></td><td><strong>Colab Pro</strong></td><td><strong>Colab Pro+</strong></td></tr><tr><td>free</td><td>$9.99/month</td><td>$49.99/month</td></tr><tr><td></td><td>Faster GPUs and TPUs</td><td>Priority access to faster GPUs and TPUs</td></tr><tr><td></td><td>More memory</td><td>Significantly more memory</td></tr><tr><td></td><td>Longer runtimes</td><td>Even longer runtimes</td></tr><tr><td></td><td></td><td>Background execution after the browser is closed</td></tr></tbody></table></figure>



<p>The details here are admittedly vague. Google says they are not able to report specifics because they fluctuate, and that they need to maintain that flexibility to maintain their ability to provide free service. </p>



<p>See more details on their FAQ page <a href="https://research.google.com/colaboratory/faq.html#resource-limits" target="_blank" rel="noreferrer noopener">https://research.google.com/colaboratory/faq.html#resource-limits</a>.</p>



<h2 class="wp-block-heading">Paperspace Gradient</h2>



<figure class="wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio"><div class="wp-block-embed__wrapper">
<iframe loading="lazy" title="Use machine learning to make anything. Yes. Anything." width="937" height="527" src="https://www.youtube.com/embed/ojPfWlD8E3M?feature=oembed" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe>
</div></figure>



<ul class="wp-block-list"><li>Learn more: <a rel="noreferrer noopener" href="https://gradient.run/" target="_blank">https://gradient.run/</a></li></ul>



<p>Paperspace is a GPU-accelerated cloud computing service. Their Gradient product is dedicated to machine learning.</p>



<h3 class="wp-block-heading"><strong>Features</strong></h3>



<ul class="wp-block-list"><li>Jupyter-like web interface.</li><li>Can switch to full Jupyter Notebook mode within the browser.</li><li>Many available datasets to work with.</li><li>Notebooks publicly visible; private access with paid account.</li><li>Website storage of notebooks. However, notebooks can also be downloaded to be run in Classic Jupyter Notebook on a PC.</li><li>Data science packages like pandas, etc. are supported with the import command.</li><li>Machine learning packages like <a rel="noreferrer noopener" href="https://blog.finxter.com/how-to-install-scikit-learn-in-python/" data-type="post" data-id="35974" target="_blank">scikit-learn</a>, etc. are supported with the <code>import</code> command.</li><li>Multiple templates are available pre-configured with notebooks for Jupyter Notebook or various ML platforms.</li><li>Three “entry points”: (1) Notebooks; (2) Workflows, which help automate tasks in creating production-grade systems; (3) Deployments, which assist preparing for production.</li><li>Free use of GPUs.</li><li>Able to support voila because of full Jupyter Notebook support when in the Classic Jupyter Notebook mode.</li></ul>



<h3 class="wp-block-heading"><strong>Tiers</strong></h3>



<figure class="wp-block-table is-style-stripes"><table><tbody><tr><td><strong>Free</strong></td><td><strong>Pro</strong></td><td><strong>Growth</strong></td></tr><tr><td>free</td><td>$8/month</td><td>$39/month</td></tr><tr><td>Public projects</td><td>Private projects</td><td>Private projects</td></tr><tr><td>5GB storage</td><td>15GB storage</td><td>50GB storage</td></tr><tr><td>Basic instances</td><td>Mid-range instances</td><td>High-end instances</td></tr><tr><td></td><td>Faster free GPUs</td><td>Expert support</td></tr></tbody></table></figure>



<h2 class="wp-block-heading">Kaggle</h2>



<figure class="wp-block-video aligncenter"><video controls src="https://www.kaggle.com/static/video/homepage_landingvideo.mp4"></video><figcaption><a href="https://www.kaggle.com/" target="_blank" rel="noreferrer noopener">source</a></figcaption></figure>



<ul class="wp-block-list"><li><strong>Learn more</strong>: <a rel="noreferrer noopener" href="https://www.kaggle.com/" target="_blank">https://www.kaggle.com/</a></li></ul>



<p>Kaggle is arguably an online community or meeting space for <a href="https://blog.finxter.com/how-much-can-you-earn-as-a-data-science-freelancer/" data-type="post" data-id="16687" target="_blank" rel="noreferrer noopener">data scientists</a> and <a href="https://blog.finxter.com/machine-learning-engineer-income-and-opportunity/" data-type="post" data-id="306050" target="_blank" rel="noreferrer noopener">machine learning</a> people. </p>



<p>As well as providing online notebooks, it includes a newsfeed, datasets, competitions, forums, and free data and machine learning courses, all accessible from a well-organized and intuitive <a href="https://blog.finxter.com/python-dash-how-to-build-a-dashboard/" data-type="post" data-id="19632" target="_blank" rel="noreferrer noopener">dashboard</a>. </p>



<p>Beyond the notebooks, you might want to join this site just because of all the resources it provides.</p>



<h3 class="wp-block-heading"><strong>Features</strong></h3>



<ul class="wp-block-list"><li>Both Jupyter-like web interface and script-like (“normal” program files) interfaces available.</li><li>Notebooks can be downloaded, then opened in Jupyter Notebook elsewhere.</li><li>Many available datasets to work with.</li><li>Data science packages like pandas, etc. are supported with the <code>import</code> command.</li><li>Machine learning packages like scikit-learn, etc. are supported with the import command.</li><li>Multiple free courses on data science and machine learning.</li><li>Free use of GPU and TPU.</li><li>Voila probably not supported.</li></ul>



<h3 class="wp-block-heading"><strong>Tiers</strong></h3>



<p>All Kaggle functions are free to use.</p>



<h2 class="wp-block-heading">JetBrains Datalore</h2>



<div class="wp-block-image"><figure class="aligncenter size-large"><a href="https://datalore.jetbrains.com/" target="_blank" rel="noopener"><img loading="lazy" decoding="async" width="1024" height="725" src="https://blog.finxter.com/wp-content/uploads/2022/04/image-225-1024x725.png" alt="" class="wp-image-324483" srcset="https://blog.finxter.com/wp-content/uploads/2022/04/image-225-1024x725.png 1024w, https://blog.finxter.com/wp-content/uploads/2022/04/image-225-300x212.png 300w, https://blog.finxter.com/wp-content/uploads/2022/04/image-225-768x544.png 768w, https://blog.finxter.com/wp-content/uploads/2022/04/image-225.png 1062w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></a></figure></div>



<ul class="wp-block-list"><li><strong>Learn More</strong>: <a href="https://datalore.jetbrains.com/" target="_blank" rel="noreferrer noopener">https://datalore.jetbrains.com/</a></li></ul>



<p>JetBrains is the company that provides the PyCharm Python IDE. Datalore is their online implementation of Jupyter Notebooks.</p>






<h3 class="wp-block-heading"><strong>Features</strong></h3>



<ul class="wp-block-list"><li>Both Jupyter-like web interface and script-like (“normal” program files) interfaces available. Other modes/features are available as well (see their website for details).</li><li>Notebooks can be downloaded, then opened in Jupyter Notebook elsewhere.</li><li>Data science packages like pandas, etc. are supported with the <code>import</code> command.</li><li>Machine learning packages like scikit-learn, etc. are supported with the <code>import</code> command.</li><li>Well-written and easy-to-use help documentation.</li><li>Free CPU use; GPU use with paid tier.</li><li>Voila is available as a package.</li></ul>



<h3 class="wp-block-heading"><strong>Tiers</strong></h3>



<figure class="wp-block-table is-style-stripes"><table><tbody><tr><td><strong>Community</strong></td><td><strong>Professional</strong></td></tr><tr><td>Free</td><td>$19.90/month</td></tr><tr><td>120 hours of computations on a basic CPU machine</td><td>Unlimited computations on a basic CPU machine</td></tr><tr><td></td><td>120 hours of computations on a powerful CPU machine</td></tr><tr><td></td><td>20 hours of computation on a GPU machine</td></tr><tr><td>10 GB of cloud storage + S3 bucket support</td><td>20 GB of cloud storage + S3 bucket support</td></tr><tr><td>Keep machine running for 6 hours after you’ve left notebook</td><td>Keep machine running for unlimited time</td></tr></tbody></table></figure>



<h2 class="wp-block-heading">Conclusion</h2>



<p>Online Jupyter Notebooks can be a valuable resource for Python computing anywhere, and ensure you have access to world-class resources for your computing. </p>



<p>To give you an idea of what is available we have reviewed a small sample of some of those resources. </p>



<p>However, this is just the tip of the iceberg of what is available. See this article for a much larger list of other available sites:</p>



<ul class="wp-block-list"><li><a href="https://www.topbestalternatives.com/google-colab/" target="_blank" rel="noreferrer noopener">https://www.topbestalternatives.com/google-colab/</a></li></ul>



<p>And this review is also only the tip of the iceberg of what these sites offer. </p>



<p>If this is something that interests you, definitely go to their sites to see what they offer; and since most have free options, try them out to see which you like best, and which best meets your Python, data science, or machine learning needs. </p>



<p>Also note this is a snapshot of offerings as of April 2022. This can be a fast-changing field, so examining the offerings yourself is highly encouraged to see what the latest changes are.</p>



<p>We wish you happy coding!</p>
<p>The post <a href="https://blog.finxter.com/survey-of-python-online-notebook-options/">Top 4 Jupyter Notebook Alternatives for Machine Learning</a> appeared first on <a href="https://blog.finxter.com">Be on the Right Side of Change</a>.</p>
]]></content:encoded>
					
		
		<enclosure url="https://www.kaggle.com/static/video/homepage_landingvideo.mp4" length="0" type="video/mp4" />

			</item>
		<item>
		<title>The Ultimate Guide to Installing Ghostscript</title>
		<link>https://blog.finxter.com/the-ultimate-guide-to-installing-ghostscript/</link>
		
		<dc:creator><![CDATA[Aaron Glatzer]]></dc:creator>
		<pubDate>Sat, 02 Apr 2022 12:45:59 +0000</pubDate>
				<category><![CDATA[Linux]]></category>
		<category><![CDATA[macOS]]></category>
		<category><![CDATA[Scripting]]></category>
		<category><![CDATA[Windows]]></category>
		<guid isPermaLink="false">https://blog.finxter.com/?p=277912</guid>

					<description><![CDATA[<p>In this article we explore how to install Ghostscript on numerous different platforms and operating systems. What is Ghostscript? Why install it? What is Ghostscript, and why would we want to install it? To understand this we should first learn about Postscript. Postscript Postscript is a page description language geared towards desktop publishing documents. If ... <a title="The Ultimate Guide to Installing Ghostscript" class="read-more" href="https://blog.finxter.com/the-ultimate-guide-to-installing-ghostscript/" aria-label="Read more about The Ultimate Guide to Installing Ghostscript">Read more</a></p>
<p>The post <a href="https://blog.finxter.com/the-ultimate-guide-to-installing-ghostscript/">The Ultimate Guide to Installing Ghostscript</a> appeared first on <a href="https://blog.finxter.com">Be on the Right Side of Change</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p>In this article we explore how to install Ghostscript on numerous different platforms and operating systems.</p>



<figure class="wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio"><div class="wp-block-embed__wrapper">
<iframe loading="lazy" title="Ghostscript - The Ultimate Guide to Getting Started with &amp; Install Ghostscript" width="937" height="527" src="https://www.youtube.com/embed/ZUfup9bflWw?feature=oembed" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe>
</div></figure>



<h2 class="wp-block-heading">What is Ghostscript? Why install it?</h2>



<p>What is Ghostscript, and why would we want to install it? To understand this we should first learn about Postscript.</p>



<h3 class="wp-block-heading">Postscript</h3>



<p>Postscript is a page description language geared towards desktop publishing documents.</p>



<p><em>If you want really professional-looking typesetting, layout, and graphics in your documents, desktop publishing software is what you use. </em></p>



<p>PostScript was created at Adobe Systems, with work starting in 1982. As a language, it is similar to Python in that documents contain human-readable and writable commands that an interpreter parses to get something done. </p>



<p>In the case of Python, text files containing Python commands are parsed by the Python interpreter to create any kind of program imaginable. </p>



<p>In the case of PostScript, files containing PostScript commands are parsed by a PostScript interpreter to render professional-looking documents, either to the screen or to a printer. </p>



<p>In addition, the PDF format grew out of PostScript: it is based on the same imaging model, adds further functionality, and is now one of the most commonly used document formats.</p>



<h3 class="wp-block-heading">Ghostscript</h3>



<p>Ghostscript is a free, open-source interpreter that renders PostScript and PDF documents. </p>



<p>One of the reasons you might want to install it is to use a program that requires it. </p>



<p>Even without a program that needs it, installing Ghostscript can be useful: </p>



<p class="has-global-color-8-background-color has-background"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/2b50.png" alt="⭐" class="wp-smiley" style="height: 1em; max-height: 1em;" /> Ghostscript can be used to modify PDF documents, such as converting PDF to images, or extracting text, among other things. </p>



<p>Even better, since Ghostscript provides a language-binding API, Ghostscript functions can be called from other languages, allowing us to write our own programs for modifying PDF documents. Supported languages are C#, Java, and Python.</p>
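
<p>As a rough illustration of the PDF-to-image use case, here is a minimal Python sketch. It does not use the language-binding API; it simply runs the Ghostscript command-line program through <code>subprocess</code>. The function names, the file name <code>example.pdf</code>, and the output pattern are hypothetical, and the sketch assumes the executable is on your <code>PATH</code> as <code>gs</code> or <code>gswin64c</code>.</p>

```python
import shutil
import subprocess


def build_pdf_to_png_command(gs_executable, pdf_path, output_pattern, dpi=150):
    """Return the Ghostscript argument list that renders each page to a PNG."""
    return [
        gs_executable,
        "-dNOPAUSE",        # do not pause between pages
        "-dBATCH",          # exit once the input has been processed
        "-dSAFER",          # restrict file access while interpreting
        "-sDEVICE=png16m",  # 24-bit colour PNG output device
        f"-r{dpi}",         # rendering resolution in dots per inch
        f"-sOutputFile={output_pattern}",
        pdf_path,
    ]


def pdf_to_png(pdf_path, output_pattern="page-%03d.png", dpi=150):
    """Render pdf_path to numbered PNG files; raise if Ghostscript is missing."""
    # Assumption: the executable is called "gs" (Mac/Linux) or "gswin64c" (Windows).
    gs_executable = shutil.which("gs") or shutil.which("gswin64c")
    if gs_executable is None:
        raise FileNotFoundError("Ghostscript executable not found on PATH")
    command = build_pdf_to_png_command(gs_executable, pdf_path, output_pattern, dpi)
    subprocess.run(command, check=True)
```

<p>Calling <code>pdf_to_png("example.pdf")</code> would then write one PNG file per page into the current directory. Keeping the command builder separate from the call makes it easy to inspect the arguments before running anything.</p>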



<h2 class="wp-block-heading">Checking if Ghostscript is Already Installed</h2>



<p>You may already have Ghostscript installed &#8211; your system may have come with it, or it may have been installed in support of a program you have installed. So save yourself some effort and check first.</p>



<h3 class="wp-block-heading">Checking for Ghostscript on Windows</h3>



<ol class="wp-block-list"><li>Press <code>Windows+R</code> to open the “<code>Run</code>” box.</li><li>In the “<code>Run</code>” box type “<code>cmd</code>” and press Enter.</li><li>A command line window opens.</li><li>In the command line window type “<code>gswin64c -h</code>” if your system is 64 bit (most machines these days), or “<code>gswin32c -h</code>” if your system is 32 bit (older machines). The trailing “<code>c</code>” selects the console version of the program. If Ghostscript is installed and on your <code>PATH</code> you will see Ghostscript help information. If you see an error, Ghostscript is either not installed or its <code>bin</code> folder is not on your <code>PATH</code>.</li><li>Type “<code>exit</code>” to close the command line window.<br></li></ol>



<h3 class="wp-block-heading">Checking for Ghostscript on Mac</h3>



<ol class="wp-block-list"><li>In the Finder, open the <code>/Applications/Utilities</code> folder, then double-click <code>Terminal</code>.&nbsp;&nbsp;&nbsp;</li><li>In the terminal window type “<code>gs -h</code>”. If Ghostscript is installed you will see Ghostscript help information. If you see an error then Ghostscript is not installed. &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</li><li>In the <code>Terminal</code> app on your Mac, choose <code>Terminal &gt; Quit Terminal</code>.<br></li></ol>



<h3 class="wp-block-heading">Checking for Ghostscript on Linux</h3>



<ol class="wp-block-list"><li>Open a terminal window. How to do this varies depending on which distribution of Linux you are using.&nbsp;&nbsp;&nbsp;</li><li>In the terminal window type “<code>gs -h</code>”. If Ghostscript is installed you will see Ghostscript help information. If you see an error then Ghostscript is not installed.<br></li></ol>
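
<p>The same check can be scripted. The following is a minimal sketch, assuming that Ghostscript, if installed, is reachable on your <code>PATH</code> under one of the common executable names listed in the code. The function name <code>find_ghostscript</code> is our own.</p>

```python
import shutil

# Assumed executable names: "gs" on Mac/Linux, console builds on Windows.
CANDIDATE_NAMES = ("gs", "gswin64c", "gswin32c")


def find_ghostscript(names=CANDIDATE_NAMES):
    """Return the full path of the first Ghostscript executable on PATH, or None."""
    for name in names:
        path = shutil.which(name)
        if path is not None:
            return path
    return None


if __name__ == "__main__":
    location = find_ghostscript()
    if location is None:
        print("Ghostscript was not found on PATH")
    else:
        print("Ghostscript found at", location)
```

<p>A result of <code>None</code> only means the executable is not on the <code>PATH</code>; Ghostscript could still be installed in a folder that the shell does not search.</p>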



<h2 class="wp-block-heading">Installing Ghostscript on Windows</h2>



<ol class="wp-block-list"><li>Go to the Ghostscript download page at <a rel="noreferrer noopener" href="https://www.ghostscript.com/releases/gsdnld.html" target="_blank">https://www.ghostscript.com/releases/gsdnld.html</a></li><li>There are two license versions available: Affero GPL (AGPL), and commercial. Review the license information at <a rel="noreferrer noopener" href="https://artifex.com/licensing/" target="_blank">https://artifex.com/licensing/</a>. For casual use most users will choose AGPL.</li><li>Choose 64 bit or 32 bit depending on your system.</li><li>Download your choice by clicking on the chosen link.</li><li>The installer program will download.</li><li>The downloaded program will be <code>gsxxxxw64.exe</code> or <code>gsxxxxw32.exe</code>. The ‘<code>xxxx</code>’ will be numbers indicating the release version. The most current version as of this writing is <code>9.55.0</code>, so the installer program would be <code>gs9550w64.exe</code> for the 64 bit version.</li><li>Double-click the downloaded installer program.</li><li>Follow the prompts to do the installation.<br></li></ol>



<h2 class="wp-block-heading">Installing Ghostscript on Unix</h2>



<p>This method builds Ghostscript from source. It applies to any Unix-like machine, so it should work on Mac or Linux.</p>



<p>Most UNIX systems have much easier ways of installing Ghostscript, so you will almost certainly not need to do this. </p>



<p>However, if you have trouble with those easier approaches you might try this as a backup. </p>



<p>This method usually works, but sometimes it does not, and then you need to do some troubleshooting to figure out why (the configure file might not be configured properly for your system, for example). </p>



<p>Also note that you will need a compiler toolchain (a C compiler and <code>make</code>) installed on your Linux or Mac system; setting that up is beyond the scope of this article. So choose this approach as a last resort.</p>






<ol class="wp-block-list"><li>Go to the Ghostscript download page and download the source code version. As of this writing this file is <code>ghostscript-9.55.0.tar.gz</code>.</li><li>Move this file to some folder where you want to work.</li><li>Unarchive the downloaded file. Usually your system will be configured to do so by double-clicking the file. If not, you can unarchive using this command in the terminal: <code>tar -xzf ghostscript-9.55.0.tar.gz</code>. The file will unpack into sub-directories and files.</li><li>In the terminal go to the top unpacked sub-directory.</li><li>Run the configure file by typing <code>./configure</code> in your terminal. This will review your system and get ready to compile the code.</li><li>Compile the code by typing <code>make</code> in your terminal.</li><li>Install the compiled code by typing this: <code>sudo make install</code><br></li></ol>



<p>Here are the commands for ease of copy&amp;paste:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">tar -xzf ghostscript-9.55.0.tar.gz
cd ghostscript-9.55.0
./configure
make
sudo make install</pre>



<h2 class="wp-block-heading">Installing Ghostscript on Mac</h2>



<p>The easiest way to install Ghostscript on Mac is to use the <strong>Homebrew</strong> or <strong>MacPorts</strong> systems. These are package management systems that bring the wide world of Unix open-source software to the Mac. </p>



<p>In these systems, much of the configuring is done for you by others so that downloading and installing software is as easy as a single command, just like downloading an app for the Mac is as simple as clicking an icon in the Mac App Store. </p>



<p>What programs are available depends on what has been prepared by others for the system. </p>



<p>Fortunately, Ghostscript is available for these systems. </p>



<p>Setting up these systems is beyond the scope of this article. This <a href="https://www.scivision.dev/homebrew-macports-fink/" target="_blank" rel="noreferrer noopener">page</a> has a nice summary of those systems (and of the Fink system, another package management system). Follow their respective links to learn more about each system.</p>



<p>Install Ghostscript using Homebrew using the following command:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">brew install ghostscript</pre>



<p>Install Ghostscript using MacPorts using the following command:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">sudo port install ghostscript</pre>



<h2 class="wp-block-heading">Installing Ghostscript on Ubuntu</h2>



<p>It is often most intuitive to install software on Ubuntu using the GUI-based software application. </p>



<p>This accesses the repositories of extensive software available for Ubuntu. </p>



<p>However, it is often the fastest to do a command line install. Do so for Ghostscript as follows:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">sudo apt install ghostscript</pre>



<h2 class="wp-block-heading">Installing Ghostscript on Other Debian-based Distributions</h2>



<p>There are many distributions that, like Ubuntu, are based on Debian. </p>



<p>Many also have GUI applications for installing software, and often these can be used to install Ghostscript. But like Ubuntu, it is often the fastest to use the command line install. </p>



<p>The command is still the same:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">sudo apt install ghostscript</pre>



<h2 class="wp-block-heading">Installing Ghostscript on CentOS 7 and Other Red Hat/Fedora-based Distributions</h2>



<p>CentOS 7 is a free rebuild of the Red Hat Enterprise Linux distribution, without Red Hat branding or technical support from Red Hat. </p>



<p>Fedora is the “bleeding-edge” freely available distribution in the Red Hat family of distributions that serves as the development foundation for the more robust and stable Red Hat distribution. </p>



<p>Since these are all in the same distribution family, Ghostscript is most quickly installed on all of them with the same command. The same holds for the many other distributions in this family. </p>



<p>The command is:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">sudo yum install ghostscript</pre>



<h2 class="wp-block-heading">Installing Ghostscript for Anaconda</h2>



<p>If you are a data scientist who is more comfortable with data analysis in Anaconda than with OS management, you can still install Ghostscript through Anaconda. </p>



<p>Open the Anaconda command line interface and enter the following command to install Ghostscript:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">conda install -c conda-forge ghostscript</pre>



<h2 class="wp-block-heading">Installing Ghostscript in Google Colab</h2>



<p>Ghostscript can even be installed in Google Colab. </p>



<p>Cells in Colab behave much like an IPython shell, so a line that starts with an exclamation mark is passed to the OS shell. We can use this to run the command that installs Ghostscript. </p>



<p>The OS behind Colab operates like Ubuntu, so the installation command mirrors that of Ubuntu. Therefore, to install Ghostscript enter the following command in a Colab cell:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">!apt-get install -y ghostscript</pre>



<h2 class="wp-block-heading">Conclusion</h2>



<p>Ghostscript is a free, open-source interpreter that renders PostScript and PDF documents either to the screen or to a printer. </p>



<p>Ghostscript can also be used to process or modify such documents. </p>



<p>Even better, because Ghostscript includes a language-binding API, programmers can use it to write programs in other languages to modify PDF documents. </p>



<p>Supported languages are <a href="https://blog.finxter.com/c-developer-income-and-opportunity/" data-type="post" data-id="189360" target="_blank" rel="noreferrer noopener">C#</a>, <a href="https://blog.finxter.com/java-developer-income-and-opportunity/" data-type="post" data-id="217907" target="_blank" rel="noreferrer noopener">Java</a>, and <a href="https://blog.finxter.com/python-developer-income-and-opportunity/" data-type="post" data-id="189354" target="_blank" rel="noreferrer noopener">Python</a>.</p>



<p>As you can see, Ghostscript is available on many different platforms and operating systems. We have shown the commands to install Ghostscript on many of them. </p>



<p>We hope you have found this helpful, and we wish you happy coding!</p>



<hr class="wp-block-separator"/>



<p>The post <a href="https://blog.finxter.com/the-ultimate-guide-to-installing-ghostscript/">The Ultimate Guide to Installing Ghostscript</a> appeared first on <a href="https://blog.finxter.com">Be on the Right Side of Change</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>How to Compress PDF Files Using Python?</title>
		<link>https://blog.finxter.com/how-to-compress-pdf-files-using-python/</link>
		
		<dc:creator><![CDATA[Aaron Glatzer]]></dc:creator>
		<pubDate>Tue, 22 Mar 2022 20:07:28 +0000</pubDate>
				<category><![CDATA[Operating System]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[Scripting]]></category>
		<guid isPermaLink="false">https://blog.finxter.com/?p=256441</guid>

					<description><![CDATA[<p>Problem Formulation Suppose you have a PDF file, but it’s too large and you’d like to compress it (perhaps you want to reduce its size to allow for faster transfer over the internet, or perhaps to save storage space).&#160; Even more challenging, suppose you have multiple PDF files you’d like to compress.&#160; Multiple online options ... <a title="How to Compress PDF Files Using Python?" class="read-more" href="https://blog.finxter.com/how-to-compress-pdf-files-using-python/" aria-label="Read more about How to Compress PDF Files Using Python?">Read more</a></p>
<p>The post <a href="https://blog.finxter.com/how-to-compress-pdf-files-using-python/">How to Compress PDF Files Using Python?</a> appeared first on <a href="https://blog.finxter.com">Be on the Right Side of Change</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<figure class="wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio"><div class="wp-block-embed__wrapper">
<iframe loading="lazy" title="How to Compress PDF Files Using Python?" width="937" height="527" src="https://www.youtube.com/embed/c4mlg-_jS-g?feature=oembed" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe>
</div></figure>



<h2 class="wp-block-heading">Problem Formulation</h2>



<p>Suppose you have a PDF file, but it’s too large and you’d like to compress it (perhaps you want to reduce its size to allow for faster transfer over the internet, or perhaps to save storage space).&nbsp; </p>



<p>Even more challenging, suppose you have multiple PDF files you’d like to compress.&nbsp; </p>



<p>Multiple online options exist, but these typically allow only a limited number of files to be processed at a time.&nbsp; There is also the extra time involved in uploading the originals and then downloading the results.&nbsp; And you may not be comfortable sharing your files over the internet.</p>



<p>Fortunately, we can use Python to address all these concerns.&nbsp; But before we learn how to do this, let’s first learn a little bit about PDF files.</p>



<h2 class="wp-block-heading">About Compressing PDF Files</h2>



<p>According to Dov Isaacs, former Adobe Principal Scientist (see his discussion <a href="https://community.adobe.com/t5/acrobat-discussions/compressing-pdf/td-p/10950834" target="_blank" rel="noreferrer noopener">here</a>), PDF documents are already substantially compressed.&nbsp; </p>



<p>The text and vector graphics portions of the documents are already internally zip-compressed, so there is little opportunity for improvement there.&nbsp; </p>



<p>Instead, any file compression improvements are achieved through compression of image portions of PDF documents, along with potential loss of image quality.&nbsp; </p>



<p>So compression might be achievable, but the user must decide how much image quality loss is acceptable in exchange for how much compression.</p>



<h2 class="wp-block-heading">Setup</h2>



<p>A programmer going by the handle <em>Theeko74</em> has written a Python script called “<code>pdf_compressor.py</code>”. This script is a wrapper for <code>ghostscript</code> functions that do the actual work of compressing PDF files.&nbsp; </p>



<p>This script is offered under the MIT license and is free to use as the user wishes.</p>



<p class="has-global-color-8-background-color has-background"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Hint</strong>: make sure you have <code>ghostscript</code> installed on your computer. To install <code>ghostscript</code>, follow <a rel="noreferrer noopener" href="https://web.mit.edu/ghostscript/www/Install.htm" data-type="URL" data-id="https://web.mit.edu/ghostscript/www/Install.htm" target="_blank">this detailed guide</a> and come back afterward.</p>



<p>Now download <code>pdf_compressor.py</code> from GitHub <a rel="noreferrer noopener" href="https://github.com/theeko74/pdfc/blob/master/pdf_compressor.py" target="_blank">here</a>.</p>



<ul class="wp-block-list">
<li>URL: <a href="https://github.com/theeko74/pdfc/blob/master/pdf_compressor.py" target="_blank" rel="noreferrer noopener">https://github.com/theeko74/pdfc/blob/master/pdf_compressor.py</a></li>
</ul>



<p>Ultimately we will be writing a Python script to perform the compression.&nbsp; </p>



<p>So we create a directory to hold the script, and use our preferred editor or <a href="https://blog.finxter.com/best-python-ide/" data-type="post" data-id="8106" target="_blank" rel="noreferrer noopener">IDE</a> to create it (this example uses Linux command line to make the directory, and uses <code><a href="https://blog.finxter.com/how-to-edit-a-text-file-in-windows-powershell/" data-type="post" data-id="236823" target="_blank" rel="noreferrer noopener">vim</a></code> as the editor to make script “<code>bpdfc.py</code>”; use your preferred choice for creating the directory and creating the script within it):</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">$ mkdir batchPDFcomp
$ cd batchPDFcomp
$ vim bpdfc.py</pre>



<p>We won’t write out the script just yet &#8211; we’ll show some details for the script a little later in this article.</p>



<p>When we do write the script, within it we’ll import “<code>pdf_compressor.py</code>” as a <a href="https://blog.finxter.com/python-how-to-import-modules-from-another-folder/" data-type="post" data-id="19786" target="_blank" rel="noreferrer noopener">module</a>.&nbsp; </p>



<p>To prepare for this we should create a subdirectory below our Python script directory.&nbsp; </p>



<p>Also, we’ll need to copy <code>pdf_compressor.py</code> into that subdirectory, and we’ll need to create a file <code><a href="https://blog.finxter.com/python-init/" data-type="post" data-id="5133" target="_blank" rel="noreferrer noopener">__init__.py</a></code> within the same subdirectory (those are double underscores each side of ‘<code>init</code>’):</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">$ mkdir pdfc
$ cp ~/Downloads/pdf_compressor.py pdfc/
$ cd pdfc
$ vim __init__.py</pre>



<p>What we have done here is created a local package <code>pdfc</code> containing a module <code>pdf_compressor.py</code>.&nbsp; </p>



<p class="has-global-color-8-background-color has-background"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Note</strong>: The presence of file <code>__init__.py</code> indicates to Python that that directory is part of a package, and to look there for modules.</p>
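
<p>If you prefer to set up the package from Python rather than from the shell, the layout can be created with a few lines. This is only a sketch: the function name <code>make_local_package</code> is our own, and the paths in the usage example are assumptions about where you saved the download.</p>

```python
import shutil
from pathlib import Path


def make_local_package(project_dir, downloaded_module, package_name="pdfc"):
    """Create project_dir/package_name with an empty __init__.py and copy the module in."""
    package_dir = Path(project_dir) / package_name
    package_dir.mkdir(parents=True, exist_ok=True)
    (package_dir / "__init__.py").touch()        # marks the directory as a package
    shutil.copy(downloaded_module, package_dir)  # for example pdf_compressor.py
    return package_dir
```

<p>For example, <code>make_local_package("batchPDFcomp", Path.home() / "Downloads" / "pdf_compressor.py")</code> would produce the <code>batchPDFcomp/pdfc</code> directory described above, assuming the file was downloaded to your <code>Downloads</code> folder.</p>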



<p>Now we are ready to write our script.</p>



<h2 class="wp-block-heading">The PDF Compression Python Script</h2>



<p>Here is our script:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">from pdfc.pdf_compressor import compress
compress('Finxter_WorldsMostDensePythonCheatSheet.pdf', 'Finxter_WorldsMostDensePythonCheatSheet_compr.pdf', power=4)</pre>



<p>As you can see it’s a very short script.&nbsp; </p>



<p>First we import the “<code>compress</code>” function from “<code>pdf_compressor</code>” module.&nbsp; </p>



<p>Then we call the “<code>compress</code>” function.&nbsp; The function takes as arguments: the input file path, the output file path, and a ‘<code>power</code>’ argument that sets compression as follows, from <strong><em>least</em></strong> compression to <strong><em>most </em></strong>(according to the documentation in the script):</p>



<p>Compression levels:</p>



<ul class="wp-block-list">
<li><code>0: default</code></li>



<li><code>1: prepress</code></li>



<li><code>2: printer</code></li>



<li><code>3: ebook</code></li>



<li><code>4: screen</code></li>
</ul>
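
<p>These level names match Ghostscript's <code>-dPDFSETTINGS</code> presets. To give a feel for what a wrapper of this kind does, here is a minimal sketch that maps a <code>power</code> value to a Ghostscript command line. It is an illustration under that assumption, not a copy of <code>pdf_compressor.py</code>; the names <code>PDF_SETTINGS</code> and <code>build_compress_command</code> are our own.</p>

```python
# Assumed mapping from the power argument to Ghostscript -dPDFSETTINGS presets.
PDF_SETTINGS = {
    0: "/default",
    1: "/prepress",
    2: "/printer",
    3: "/ebook",
    4: "/screen",
}


def build_compress_command(input_path, output_path, power=0, gs_executable="gs"):
    """Return a Ghostscript argument list that rewrites a PDF with the chosen preset."""
    if power not in PDF_SETTINGS:
        raise ValueError(f"power must be one of {sorted(PDF_SETTINGS)}")
    return [
        gs_executable,
        "-sDEVICE=pdfwrite",                     # write a new PDF file
        f"-dPDFSETTINGS={PDF_SETTINGS[power]}",  # quality/size preset
        "-dNOPAUSE",                             # do not pause between pages
        "-dQUIET",                               # suppress routine messages
        "-dBATCH",                               # exit when processing is done
        f"-sOutputFile={output_path}",
        input_path,
    ]
```

<p>The returned list can be passed to <code>subprocess.run</code>. Because the presets mainly control how images are downsampled and recompressed, a text-only PDF may shrink very little at any level.</p>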



<h2 class="wp-block-heading">Running the Script</h2>



<p>Now we can run our script:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">$ python bpdfc.py
Compress PDF...
Compression by 51%.
Final file size is 0.2MB
Done.
$ </pre>



<p>We have only compressed one PDF document in this example, but by modifying the script to loop through multiple PDF documents one can compress multiple files at once.&nbsp; </p>



<p>However, we leave that as an exercise for the reader!</p>
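
<p>For readers who would like a starting point for that exercise, here is one possible sketch. It takes the compression function as a parameter, so it assumes only that the function accepts an input path, an output path, and a <code>power</code> keyword, as <code>compress</code> does in the script above. The function name <code>compress_folder</code> and the folder names in the usage note are hypothetical.</p>

```python
from pathlib import Path


def compress_folder(source_dir, target_dir, compress_func, power=4):
    """Compress every PDF in source_dir into target_dir; return the output paths."""
    source_dir = Path(source_dir)
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for pdf_path in sorted(source_dir.glob("*.pdf")):
        output_path = target_dir / f"{pdf_path.stem}_compr.pdf"
        # compress_func is expected to take (input path, output path, power=...)
        compress_func(str(pdf_path), str(output_path), power=power)
        written.append(output_path)
    return written
```

<p>With the package from the Setup section in place, you could import <code>compress</code> from <code>pdfc.pdf_compressor</code> and call <code>compress_folder("originals", "compressed", compress)</code>. Writing the results to a separate folder keeps the original files untouched.</p>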



<p>We hope you have found this article useful. Thank you for reading, and we wish you happy coding!</p>



<p class="has-base-background-color has-background"> <img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f449.png" alt="👉" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Recommended Tutorial</strong>: <a rel="noreferrer noopener" href="https://blog.finxter.com/compress-images-python/" data-type="URL" data-id="https://blog.finxter.com/compress-images-python/" target="_blank">How to Compress Images in Python</a></p>
<p>The post <a href="https://blog.finxter.com/how-to-compress-pdf-files-using-python/">How to Compress PDF Files Using Python?</a> appeared first on <a href="https://blog.finxter.com">Be on the Right Side of Change</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Mutable vs. Immutable Objects in Python</title>
		<link>https://blog.finxter.com/mutable-vs-immutable-objects-in-python/</link>
		
		<dc:creator><![CDATA[Aaron Glatzer]]></dc:creator>
		<pubDate>Tue, 22 Feb 2022 10:38:49 +0000</pubDate>
				<category><![CDATA[Computer Science]]></category>
		<category><![CDATA[Object Orientation]]></category>
		<category><![CDATA[Python]]></category>
		<guid isPermaLink="false">https://blog.finxter.com/?p=204090</guid>

					<description><![CDATA[<p>Overview: Mutable objects are Python objects that can be changed. Immutable objects are Python objects that cannot be changed. The difference reflects how various types of objects are actually represented in computer memory. Be aware of these differences to avoid surprising bugs in your programs. Introduction To be proficient ... <a title="Mutable vs. Immutable Objects in Python" class="read-more" href="https://blog.finxter.com/mutable-vs-immutable-objects-in-python/" aria-label="Read more about Mutable vs. Immutable Objects in Python">Read more</a></p>
<p>The post <a href="https://blog.finxter.com/mutable-vs-immutable-objects-in-python/">Mutable vs. Immutable Objects in Python</a> appeared first on <a href="https://blog.finxter.com">Be on the Right Side of Change</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<figure class="wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio"><div class="wp-block-embed__wrapper">
<iframe loading="lazy" title="Mutable vs. Immutable Objects in Python" width="937" height="527" src="https://www.youtube.com/embed/FSE3Jp6hbss?feature=oembed" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe>
</div></figure>



<p id="overview"><strong>Overview</strong>:</p>



<ul class="wp-block-list"><li><strong>Mutable objects are Python objects that can be changed.</strong></li><li><strong>Immutable objects are Python objects that cannot be changed.</strong></li><li><strong>The difference reflects how various types of objects are actually represented in computer memory.</strong></li><li><strong>Be aware of these differences to avoid surprising bugs in your programs.</strong></li></ul>



<h2 class="wp-block-heading" id="introduction">Introduction</h2>



<p>To be proficient, a <a href="https://blog.finxter.com/python-crash-course/" data-type="post" data-id="3951" target="_blank" rel="noreferrer noopener">Python programmer</a> must master a number of skills.&nbsp; Among those is an understanding of the notion of <strong><em>mutable vs immutable objects</em></strong>.&nbsp; This is an important subject, as without attention to it programmers can create unexpected and subtle bugs in their programs.</p>



<p>As described above, the basic idea is that mutable objects can be changed and immutable objects cannot.&nbsp; This is a simple description, but for a proper understanding, we need a little context.&nbsp; Let&#8217;s explore this in the context of the Python data types.</p>



<h2 class="wp-block-heading" id="mutable-vs-immutable-data-types">Mutable vs. Immutable Data Types</h2>



<p>The first place a programmer is likely to encounter mutable vs. immutable objects is with the Python data types.&nbsp; </p>



<p>Here are the most common data types programmers initially encounter, and whether they are <strong>mutable</strong> or <strong>immutable</strong> (this is not a complete list; Python does have a few other data types):</p>



<figure class="wp-block-table is-style-stripes"><table><tbody><tr><td><strong>Data type</strong></td><td><strong>Mutable or Immutable?</strong></td></tr><tr><td><code><a href="https://blog.finxter.com/python-int-function/" data-type="post" data-id="22715" target="_blank" rel="noreferrer noopener">int</a></code></td><td>immutable</td></tr><tr><td><code><a href="https://blog.finxter.com/python-float-function/" data-type="post" data-id="22782" target="_blank" rel="noreferrer noopener">float</a></code></td><td>immutable</td></tr><tr><td><code><a href="https://blog.finxter.com/python-str-function/" data-type="post" data-id="23735" target="_blank" rel="noreferrer noopener">str</a></code></td><td>immutable</td></tr><tr><td><code><a href="https://blog.finxter.com/python-list/" data-type="post" data-id="21502" target="_blank" rel="noreferrer noopener">list</a></code></td><td>mutable</td></tr><tr><td><code><a href="https://blog.finxter.com/python-tuple/" data-type="post" data-id="21575" target="_blank" rel="noreferrer noopener">tuple</a></code></td><td>immutable</td></tr><tr><td><code><a href="https://blog.finxter.com/python-dict/" data-type="post" data-id="19866" target="_blank" rel="noreferrer noopener">dict</a></code></td><td>mutable</td></tr><tr><td><code><a href="https://blog.finxter.com/python-bool/" data-type="post" data-id="17841" target="_blank" rel="noreferrer noopener">bool</a></code></td><td>immutable</td></tr></tbody></table></figure>



<p>Let&#8217;s experiment with a few of these in the Python shell and observe their mutability/immutability.&nbsp;&nbsp;</p>



<p>First let&#8217;s experiment with the <a href="https://blog.finxter.com/python-lists/" data-type="post" data-id="7332" target="_blank" rel="noreferrer noopener">list</a>, which should be mutable.&nbsp; We&#8217;ll start by <a href="https://blog.finxter.com/how-to-create-a-python-list/" data-type="post" data-id="10436" target="_blank" rel="noreferrer noopener">creating a list</a>:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> our_list1 = ['spam', 'eggs']</pre>



<p>Now let&#8217;s try changing the list by assigning to one of its elements (item assignment, a close relative of <a href="https://blog.finxter.com/python-slice-assignment/" data-type="post" data-id="1942" target="_blank" rel="noreferrer noopener">slice assignment</a>):</p>



<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> our_list1[0] = 'toast'</pre>



<p>Now let&#8217;s view our list and see if it has changed:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> our_list1
['toast', 'eggs']</pre>



<p>Indeed, it has.</p>



<p>Now let&#8217;s experiment with <strong>integers</strong>, which should be <strong>immutable</strong>.&nbsp; We&#8217;ll start by <a href="https://blog.finxter.com/python-in-place-assignment-operators/" data-type="post" data-id="33217" target="_blank" rel="noreferrer noopener">assigning</a> an integer to our variable:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> our_int1 = 3
>>> our_int1
3
</pre>



<p>Now let&#8217;s try changing it:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> our_int1 = 42
>>> our_int1
42</pre>



<p>It changed.&nbsp;If you&#8217;ve already worked with Python this should not surprise you.&nbsp; </p>



<p><em>So in what sense is an integer immutable?&nbsp; What&#8217;s going on here?&nbsp; What do the Python language designers mean when they claim integers are immutable?</em></p>



<p>It turns out the two cases are actually different.</p>



<ul class="wp-block-list"><li>In the case of the list, the variable still contains the original list, but that list was modified.&nbsp; </li><li>In the case of the integer, the variable now holds an entirely different integer; the original integer itself was never modified.</li></ul>



<p>While this may seem intuitive in this example, it&#8217;s not always quite so clear as we&#8217;ll see later.&nbsp; </p>



<p>Many of us start out understanding variables as containers for data. The reality, where data is stored in memory, is a little more complicated.&nbsp; </p>



<p>The Python <code><a href="https://blog.finxter.com/python-id-function/" data-type="post" data-id="24087" target="_blank" rel="noreferrer noopener">id()</a></code> function will help us understand that.&nbsp;&nbsp;</p>



<h2 class="wp-block-heading" id="looking-under-the-hood-the-id-function">Looking Under the Hood: the id() Function</h2>



<p>The common understanding of variables as containers for data is not quite right.&nbsp; In reality, variables contain references to where the data is stored, rather than the actual data itself.</p>



<p>Every object in Python has an integer <strong><em>identifier</em></strong>, and the <code>id()</code> function will show us that identifier (id).</p>



<p>In fact, in CPython (the standard Python implementation) that <em>id</em> is the (virtualized) memory location where the object is stored.&nbsp; </p>



<p>Let&#8217;s try our previous examples and use the <code>id()</code> function to see what is happening in memory:</p>



<p class="has-global-color-8-background-color has-background"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f6d1.png" alt="🛑" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Note</strong>: be aware that if you try this yourself your memory locations will be different.</p>



<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> our_list1 = ['spam', 'eggs']
>>> id(our_list1)
139946630082696
</pre>



<p>So there&#8217;s a list at memory location <code>139946630082696</code>.&nbsp; </p>



<p>Now let&#8217;s change the list by assigning to one of its elements (item assignment, a close relative of <a href="https://blog.finxter.com/python-slice-assignment/" data-type="post" data-id="1942" target="_blank" rel="noreferrer noopener">slice assignment</a>):</p>



<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> our_list1[0] = 'toast'
>>> our_list1
['toast', 'eggs']
>>> id(our_list1)
139946630082696
</pre>



<p>The memory location referenced by <code>our_list1</code> is still <code>139946630082696</code>.&nbsp; The same list is still there; it has just been modified.</p>



<p>Now let&#8217;s repeat our integer experiment, again using the <code>id()</code> function to see what is happening in memory:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> our_int1 = 3
>>> our_int1
3
>>> id(our_int1)
9079072
</pre>



<p>So integer 3 is stored at memory location 9079072.&nbsp; Now let&#8217;s try to change it:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> our_int1 = 42
>>> our_int1
42
>>> id(our_int1)
9080320
</pre>



<p>So assigning <code>42</code> to <code>our_int1</code> did not overwrite the integer <code>3</code> at memory location <code>9079072</code> with the integer <code>42</code>.&nbsp; </p>



<p>Instead it is referencing an entirely new memory location.&nbsp;</p>



<p>The contents of memory location <code>9079072</code> were not changed; the variable simply refers to memory location <code>9080320</code> now.&nbsp; The original object, the integer 3, still remains at location <code>9079072</code>.&nbsp; </p>



<p>Depending on the specific type of object, if it is no longer used it will eventually be removed from memory entirely by Python&#8217;s <a href="https://blog.finxter.com/how-can-i-explicitly-free-memory-in-python/" data-type="post" data-id="35403" target="_blank" rel="noreferrer noopener">garbage collection</a> process. We won&#8217;t go into that level of detail in this article &#8211; thankfully Python takes care of this for us and we don&#8217;t need to worry about it.&nbsp;</p>



<p>We&#8217;ve learned lists can be modified.&nbsp; So here&#8217;s a little puzzle for you.&nbsp; Let&#8217;s try modifying our list variable in a different way:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> our_list1 = ['spam', 'eggs']
>>> id(our_list1)
139946630082696
>>> our_list1  = ['toast', 'eggs']
>>> our_list1
['toast', 'eggs']
>>> id(our_list1)
</pre>



<p>What do you think the id will be?&nbsp; Let&#8217;s see the answer:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> id(our_list1)
139946629319240
</pre>



<p>Whoa, a new id!</p>



<p>Python has not modified the original list; the variable now refers to a brand new one.&nbsp; </p>



<p>So a list is modified in place when we do something like assign to its elements, but when a new list is assigned to the variable, the variable stops referring to the old list and refers to the new one instead.&nbsp; </p>



<p><strong>Remember</strong>: What happens to a list, whether being modified or replaced, depends on what you do with it.</p>



<p>However, if you are ever unsure what is happening, you can always use the <code>id()</code> function to figure it out.</p>
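
<p>The difference between modifying and replacing can also be checked in a short script. The sketch below is only an illustration: it keeps a second reference to the list (the name <code>original</code> is our own assumption, not part of the earlier examples) and uses Python&#8217;s <code>is</code> operator, which compares object identity much as comparing two <code>id()</code> values does.</p>

```python
# Keep a second reference to the original list so we can compare identities.
our_list1 = ['spam', 'eggs']
original = our_list1

# Item assignment modifies the existing list in place.
our_list1[0] = 'toast'
same_after_item_assignment = our_list1 is original

# Assigning a new list to the variable makes it refer to a different object.
our_list1 = ['toast', 'eggs']
same_after_reassignment = our_list1 is original

print(same_after_item_assignment)  # True: still the same list object
print(same_after_reassignment)     # False: a brand new list object
print(our_list1 == original)       # True: equal contents, distinct objects
```

<p>Note that <code>==</code> compares contents while <code>is</code> compares identity, which is why the last two lines can disagree.</p>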



<h1 class="wp-block-heading" id="mutable-vs-immutable-objects">Mutable vs. Immutable Objects</h1>



<p>So we&#8217;ve explored mutability in Python for data types.&nbsp;</p>



<p>However, this notion applies to more than just data types &#8211; it applies to all objects in Python.&nbsp; </p>



<p>And as you may have heard, EVERYTHING in Python is an object!</p>



<p>The topic of objects, classes, and object-oriented programming is vast, and beyond the scope of this article. You can start with an introduction to Python object-orientation in this blog tutorial:</p>



<ul class="wp-block-list"><li><a rel="noreferrer noopener" href="https://blog.finxter.com/introduction-to-python-classes/" data-type="URL" data-id="https://blog.finxter.com/introduction-to-python-classes/" target="_blank">Introduction to Python Classes</a></li></ul>



<p>Some objects are mutable, and some are immutable.&nbsp; One notable case is programmer-created classes and objects &#8212; these are in general mutable.</p>
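
<p>As a rough illustration of that last point, here is a minimal sketch. The class names <code>Outfit</code> and <code>FrozenOutfit</code> are hypothetical and exist only for this example. A plain class lets us change its attributes freely, while a frozen dataclass from the standard library&#8217;s <code>dataclasses</code> module is one way to opt out of that default.</p>

```python
from dataclasses import dataclass, FrozenInstanceError


class Outfit:
    """A plain user-defined class (hypothetical); instances are mutable."""

    def __init__(self, items):
        self.items = items


@dataclass(frozen=True)
class FrozenOutfit:
    """A frozen dataclass (hypothetical); assigning to a field raises an error."""

    items: tuple


outfit = Outfit(['hat', 'coat'])
outfit_id = id(outfit)
outfit.items = ['hat', 'coat', 'umbrella']   # the attribute changes in place
still_same_object = id(outfit) == outfit_id  # same object, new state

frozen = FrozenOutfit(('hat', 'coat'))
try:
    frozen.items = ('hat',)
    frozen_blocked = False
except FrozenInstanceError:
    frozen_blocked = True

print(still_same_object, frozen_blocked)  # True True
```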



<h2 class="wp-block-heading" id="modifying-a-copy-of-a-mutable-object">Modifying a &#8220;Copy&#8221; of a Mutable Object</h2>



<p>What happens if we want to copy one variable to another so that we can modify the copy?&nbsp; Let&#8217;s try a plain assignment:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">normal_wear = ['hat', 'coat']
rain_wear = normal_wear</pre>



<p>Our rainy weather wear is the same as our normal wear, but we want to modify our rainy wear to add an umbrella.&nbsp; Before we do, let&#8217;s use <code>id()</code> to examine this more closely:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> id(normal_wear)
139946629319112
>>> id(rain_wear)
139946629319112
</pre>



<p>So the copy appears to actually be the same object as the original.&nbsp; Let&#8217;s try modifying the copy:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> rain_wear.append('umbrella')
>>> rain_wear
['hat', 'coat', 'umbrella']
>>> normal_wear
['hat', 'coat', 'umbrella']
</pre>



<p>So what we learned from <code>id()</code> is true: our &#8220;copy&#8221; is actually the same object as the original, and modifying the &#8220;copy&#8221; modifies the original.&nbsp; So watch out for this!</p>



<p>Python does provide a solution for this through the <code>copy</code> module.&nbsp; We won&#8217;t examine that here, but just be aware of this issue, and know that a solution is available.</p>
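
<p>A full treatment is out of scope, but for the curious, here is a brief sketch of what that can look like. It uses <code>copy.copy()</code> and <code>copy.deepcopy()</code> from the standard library; the <code>wardrobe</code> list is a made-up example used only to show the difference between the two.</p>

```python
import copy

normal_wear = ['hat', 'coat']
rain_wear = copy.copy(normal_wear)   # shallow copy: a new list object
rain_wear.append('umbrella')

print(normal_wear)               # ['hat', 'coat']
print(rain_wear)                 # ['hat', 'coat', 'umbrella']
print(rain_wear is normal_wear)  # False

# A shallow copy still shares any nested mutable objects;
# copy.deepcopy() copies those as well.
wardrobe = [['hat', 'coat'], ['scarf']]
shallow = copy.copy(wardrobe)
deep = copy.deepcopy(wardrobe)
shares_inner = shallow[0] is wardrobe[0]
deep_shares_inner = deep[0] is wardrobe[0]
print(shares_inner, deep_shares_inner)  # True False
```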



<p class="has-global-color-8-background-color has-background"><img src="https://s.w.org/images/core/emoji/17.0.2/72x72/1f4a1.png" alt="💡" class="wp-smiley" style="height: 1em; max-height: 1em;" /> <strong>Note</strong>: immutable objects behave almost the same. When an immutable value is copied to a second variable, both actually refer to the same object. The difference for the immutable case is that when the second variable is changed (for example with <code>+=</code>), it gets a brand new object and the original is left untouched.</p>
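
<p>The immutable case can be sketched the same way. The variable names below are ours, chosen only for illustration; the point is that changing the second variable leaves the first one untouched.</p>

```python
count = 3
backup = count                   # both names refer to the same int object
shared_before = backup is count

backup += 1                      # ints are immutable: backup gets a new object
print(shared_before)             # True
print(count, backup)             # 3 4

greeting = 'spam'
shout = greeting
shout += '!'                     # the same holds for strings
print(greeting, shout)           # spam spam!
```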



<h2 class="wp-block-heading" id="bug-risk-and-power-mutable-objects-in-functions">Bug Risk, and Power: Mutable Objects in Functions</h2>



<p>If you&#8217;re not careful, the problem we saw in the last section, modifying a &#8220;copy&#8221; of a variable, can happen when writing a function.&nbsp; </p>



<p>Suppose we had written a function to perform the change from the last section.&nbsp; </p>



<p>Let&#8217;s write a short program <code>dressForRain.py</code> which includes such a function:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">def prepForRain(outdoor_wear):
    outdoor_wear.append('umbrella')
    rain_outdoor_wear = outdoor_wear
    return rain_outdoor_wear

normal_wear = ['hat', 'coat']
print('Here is our normal wear:', normal_wear)
rain_wear = prepForRain(normal_wear)
print('Here is our rain wear:', rain_wear)
print('What happened to our normal wear?:', normal_wear)
</pre>



<p>We know that the data is passed into the function, and the new processed value is returned to the main program.&nbsp; </p>



<p>We also know that the variable created within the function, the parameter <code>outdoor_wear</code>, is destroyed when the function is finished.&nbsp; </p>



<p>Ideally this isolates the internal operation of the function from the main program.&nbsp;&nbsp;</p>



<p>Let&#8217;s see the actual results from the program (A Linux implementation is shown.&nbsp; A Windows implementation will be the same, but with a different prompt):</p>



<pre class="EnlighterJSRAW" data-enlighter-language="generic" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">$ python dressForRain.py
Here is our normal wear: ['hat', 'coat']
Here is our rain wear: ['hat', 'coat', 'umbrella']
What happened to our normal wear?: ['hat', 'coat', 'umbrella']
</pre>



<p>Since variables <code>normal_wear</code> and <code>outdoor_wear</code> both point to the same mutable object, <code>normal_wear</code> is modified when <code>outdoor_wear</code> is appended to, which you might not have intended, resulting in a potential bug in your program.&nbsp; </p>



<p>Had these variables been pointing to an immutable object such as a <a rel="noreferrer noopener" href="https://blog.finxter.com/the-ultimate-guide-to-python-tuples/" data-type="post" data-id="12043" target="_blank">tuple</a> this would not have happened. Note, however, tuples do not support append, and a concatenation operation would have to be done instead.</p>
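
<p>Here is a minimal sketch of that tuple-based variant. The function name <code>prep_for_rain_tuple</code> is hypothetical and not part of <code>dressForRain.py</code>. Because concatenation builds a new tuple, the caller&#8217;s data is left alone.</p>

```python
def prep_for_rain_tuple(outdoor_wear):
    # Tuples have no append(); concatenation builds a brand new tuple.
    return outdoor_wear + ('umbrella',)


normal_wear = ('hat', 'coat')
rain_wear = prep_for_rain_tuple(normal_wear)

print('Here is our normal wear:', normal_wear)  # ('hat', 'coat')
print('Here is our rain wear:', rain_wear)      # ('hat', 'coat', 'umbrella')
```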



<p>Though we have shown some risk in using lists in a function, there is also power.&nbsp; </p>



<p>Functions can be used to <a href="https://blog.finxter.com/how-to-apply-a-function-to-each-element-of-a-list/" data-type="post" data-id="28717" target="_blank" rel="noreferrer noopener">modify lists</a> directly, and since the original list is modified directly, no <code>return</code> statement would be needed to return a value back to the main program.</p>
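
<p>As a small sketch of that idea (the function name <code>add_umbrella</code> is our own), a function can deliberately modify the list it receives and return nothing at all:</p>

```python
def add_umbrella(outdoor_wear):
    """Modify the caller's list in place; nothing is returned."""
    outdoor_wear.append('umbrella')


rain_wear = ['hat', 'coat']
result = add_umbrella(rain_wear)

print(rain_wear)  # ['hat', 'coat', 'umbrella']
print(result)     # None
```

<p>When a function works this way, it helps to say so in its name or docstring so that callers are not surprised by the change.</p>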



<h2 class="wp-block-heading" id="tuple-mutable-gotcha">Tuple Mutable(?) &#8216;Gotcha&#8217;</h2>



<p>Here is one last, perhaps surprising, behavior to note.&nbsp; We&#8217;ve mentioned that tuples are immutable.&nbsp; </p>



<p>Let&#8217;s explore this a little further with the following tuple:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> some_tuple = ('yadda', [1, 2])</pre>



<p>Let&#8217;s try modifying this by adding <code>3</code> to the list it contains:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> some_tuple[1].append(3)</pre>



<p>What do you think happens?&nbsp; Let&#8217;s see:</p>



<pre class="EnlighterJSRAW" data-enlighter-language="python" data-enlighter-theme="" data-enlighter-highlight="" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-title="" data-enlighter-group="">>>> some_tuple
('yadda', [1, 2, 3])
</pre>



<p>Did our tuple change?&nbsp; No, it did not. It still contains the same list &#8211; it is the list within the tuple that has changed.&nbsp; </p>



<p>You can try the <code>id()</code> function on the list portion of the tuple to confirm it&#8217;s the same list.</p>
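
<p>One way to run that check is sketched below. It records the ids before the append and compares them afterwards, then shows that the tuple itself refuses to have one of its elements replaced. The helper variable names are ours.</p>

```python
some_tuple = ('yadda', [1, 2])
tuple_id = id(some_tuple)
inner_id = id(some_tuple[1])

some_tuple[1].append(3)          # modifies the list stored inside the tuple
same_tuple = id(some_tuple) == tuple_id
same_inner = id(some_tuple[1]) == inner_id

# Replacing an element of the tuple itself is not allowed.
try:
    some_tuple[1] = [1, 2, 3]
    replacement_allowed = True
except TypeError:
    replacement_allowed = False

print(some_tuple)                # ('yadda', [1, 2, 3])
print(same_tuple, same_inner)    # True True
print(replacement_allowed)       # False
```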



<h2 class="wp-block-heading" id="why-bother-with-mutable-vs-immutable">Why Bother with Mutable vs. Immutable?</h2>



<p>This mutable/immutable situation may seem a bit complicated.&nbsp; </p>



<p><em><strong>Why did the Python designers do this? Wouldn&#8217;t it have been simpler to make all objects mutable, or all objects immutable?</strong></em></p>



<p>Both mutable and immutable properties have advantages and disadvantages, so it comes down to design preferences. </p>



<p class="has-base-2-background-color has-background"><strong>Advantage</strong>: For instance, one major performance advantage of using <strong>immutable instead of mutable data types</strong> is that a potentially large number of variables can refer to a single immutable object without risking problems arising from overshadowing or <a rel="noreferrer noopener" href="https://en.wikipedia.org/wiki/Aliasing_(computing)" data-type="URL" data-id="https://en.wikipedia.org/wiki/Aliasing_(computing)" target="_blank">aliasing</a>. If the object were mutable, each variable would have to refer to its own copy of the object, which would incur much higher memory overhead.</p>



<p>These choices are affected by how objects are typically used, and these choices affect language and program performance.&nbsp;Language designers take these factors into account when making those choices.&nbsp; </p>



<p>Be aware that other languages address the mutable/immutable topic as well, but they do not all implement these properties in the same way.&nbsp; </p>



<p>We will not go into more detail on this in this article.&nbsp; Your appreciation of these choices will develop in the future as you gain more experience with programming.</p>



<h2 class="wp-block-heading" id="conclusion">Conclusion</h2>



<ul class="wp-block-list"><li>We have noted that Python makes some of its objects mutable and some immutable.&nbsp; </li><li>We have explored what this means, and what some of the practical consequences are.&nbsp; </li><li>We have noted how this is a consequence of how objects are stored in memory.</li><li>We have introduced Python&#8217;s <code><a href="https://blog.finxter.com/python-id-function/" data-type="post" data-id="24087" target="_blank" rel="noreferrer noopener">id()</a></code> function as a way to better follow this memory use.</li></ul>



<p>High-level programming languages are an ever-advancing effort to make programming easier, freeing programmers to produce great software without having to grapple with the minute details as the computer sees them.&nbsp; </p>



<p>Being aware of how mutable and immutable objects are handled in memory is one case where a bit more awareness of the details of the computer will reap rewards.&nbsp; Keep these details in mind and ensure your programs perform at their best.</p>



<hr class="wp-block-separator"/>



<p>The post <a href="https://blog.finxter.com/mutable-vs-immutable-objects-in-python/">Mutable vs. Immutable Objects in Python</a> appeared first on <a href="https://blog.finxter.com">Be on the Right Side of Change</a>.</p>
]]></content:encoded>
					
		
		
			</item>
	</channel>
</rss>
