Universes that Learn: Cellular Automata Applications and Acceleration

By James Montantes | November 10, 2020

In the MNIST article, repeated application of CA rules eventually causes the cells to settle on a consensus classification (designated by color) for a given starting digit. Applying n updates to an image according to a set of CA rules is not altogether dissimilar to feeding that same image through an n-layer convolutional network, so it is not surprising that CA can solve a classic conv-net demo problem. In fact, as we’ll see later, CA rules can be implemented as convolutions, which means we can take advantage of the substantial development efforts in software, hardware, and systems for deep learning.
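
To make that analogy concrete, here is a minimal sketch (my own illustration, not code from the article): a single fixed 3×3 convolution plus a nonlinearity stands in for one CA update, and applying it n times is structurally an n-layer convolutional network whose layers share weights.

import torch
import torch.nn.functional as F

# Hypothetical stand-in for a (learned or hand-designed) CA rule: one
# 3x3 filter shared by every "layer"/update step.
kernel = torch.randn(1, 1, 3, 3)

def ca_like_net(x, n):
    # The same local, translation-invariant computation is applied at every
    # step, just as every cell applies the same rule to its neighborhood.
    for _ in range(n):
        x = torch.sigmoid(F.conv2d(x, kernel, padding=1))
    return x

x = torch.rand(1, 1, 28, 28)  # an MNIST-sized input
out = ca_like_net(x, n=8)     # 8 CA updates ~ an 8-layer weight-tied conv net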

CA can do a lot more than “just” simulate physics. However, the nature of CA computation doesn’t lend itself to conventional, serial execution on von Neumann-style architectures; good performance requires high-throughput parallelism. Luckily, while bespoke, single-purpose accelerators may offer some benefits, we don’t need to develop new accelerators from scratch: we can use many of the same software and hardware tools that accelerate deep learning to get similar speedups with cellular automata universes.

It’s useful to keep in mind that von Neumann developed his 29-state CA using pen and paper, while Conway developed the two-state Game of Life (GOL) by playing with the stones and grid of a Go board. While it would probably be comparatively simple today to use a computational search to discover new rules that satisfy the growth-like characteristics Conway was after in Life, the simple tools von Neumann and Conway used in their work on cellular automata are a nice reminder that Moore’s law is not responsible for every inch of progress.
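
As a rough sketch of what such a computational search might look like (my own toy construction, not anything von Neumann or Conway did): enumerate Life-like rules in the usual “B/S” notation and keep the ones whose populations neither die out nor saturate when run on random soup. The density test below is a crude, hypothetical heuristic for “growth-like” behavior.

import itertools
import torch
import torch.nn.functional as F

# 3x3 kernel that counts the 8 neighbors, excluding the cell itself
kernel = torch.ones(1, 1, 3, 3)
kernel[0, 0, 1, 1] = 0

def step(grid, birth, survive):
    counts = F.conv2d(grid, kernel, padding=1)
    born = sum((counts == b).float() for b in birth) * (1 - grid)
    stay = sum((counts == s).float() for s in survive) * grid
    return born + stay

def looks_growth_like(birth, survive, steps=50):
    grid = (torch.rand(1, 1, 64, 64) > 0.5).float()
    for _ in range(steps):
        grid = step(grid, birth, survive)
    density = grid.mean().item()
    return 0.01 < density < 0.5  # neither extinct nor saturated (heuristic)

# Brute-force a small slice of the Life-like rule space (Life itself is B3/S23)
for birth in itertools.combinations(range(1, 9), 1):
    for survive in itertools.combinations(range(1, 9), 2):
        if looks_growth_like(birth, survive):
            print("B" + "".join(map(str, birth)) + "/S" + "".join(map(str, survive)))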

CA systems, like neural networks, are not particularly well-suited to implementation on typical general-purpose computers. These tend to be based on the von Neumann architecture (although multiple cores and cache memory do stretch the original concept), execute instructions sequentially, and emphasize low latency over parallelism. A cellular automaton universe, on the other hand, is inherently parallel, and often massively so: each individual cell makes an identical computation based only on its local context.
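
To make that “identical computation from local context” point concrete, here is a minimal sketch (mine, not from the article) that computes every cell’s neighbor count in one vectorized pass, shifting the whole grid toward each of its eight neighbors and summing. Note that torch.roll wraps around, so this assumes a toroidal boundary.

import torch

def neighbor_counts(grid):
    # Shift the entire grid toward each of the 8 neighbor offsets and
    # accumulate: the same local rule, applied to all cells at once.
    counts = torch.zeros_like(grid)
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            if dx == 0 and dy == 0:
                continue  # skip the cell itself
            counts += torch.roll(grid, shifts=(dx, dy), dims=(0, 1))
    return counts

grid = (torch.rand(256, 256) > 0.5).float()
counts = neighbor_counts(grid)  # no per-cell Python loop anywhere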

The utility of CA systems led to several projects for dedicated CA processors, much as modern interest in deep learning has motivated the development of numerous neural coprocessors and dedicated accelerators. These specialized computers are sometimes called cellular automata machines (CAMs). In the 1980s and 1990s, CAMs were likely to be custom-designed for specific, often one-off purposes. The CAM-Brain project was an attempt to build a system using a Field Programmable Gate Array (FPGA) to evolve CA structures in order to simulate neurons and neural circuits. Systolic arrays, essentially mosaics of small processors that transform and transport data to and from one another, would seem to be an ideal substrate for implementing CA, and indeed there have been several projects (including Google’s TPU) that take this approach. Systolic arrays sidestep one of the most often overlooked hurdles in high-performance computing: communications bottlenecks (the motivation behind Nvidia’s NVLink for multi-GPU systems and AMD’s Infinity Fabric).

There have also been algorithmic speed-ups for the most popular CA rules: Hashlife, developed by Bill Gosper in the 1980s, uses memoization to speed up CA computations by remembering previously seen patterns. Just like special-purpose accelerators for deep learning (e.g. Graphcore’s IPU or Cerebras’ massive wafer-scale accelerator), single-purpose accelerators for CA computations trade flexibility for speed.
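
Hashlife itself canonicalizes blocks in a quadtree and memoizes updates recursively over enormous regions of space and time; the toy sketch below (my own illustration, not Gosper’s algorithm) shows just the core memoization idea at the smallest possible scale, caching the update of a single cell given its 3×3 neighborhood. With only 2^9 = 512 possible neighborhoods the cache quickly becomes a complete lookup table; Hashlife’s power comes from applying the same caching to ever-larger blocks.

# Toy memoization of a Game of Life update: the cache maps a 3x3 neighborhood
# (a tuple of tuples of 0/1) to the next state of its center cell.
cache = {}

def memoized_center(tile):
    if tile not in cache:
        neighbors = sum(v for row in tile for v in row) - tile[1][1]
        alive = tile[1][1]
        cache[tile] = 1 if neighbors == 3 or (alive and neighbors == 2) else 0
    return cache[tile]

# Example: the center of a vertical blinker survives with two neighbors
tile = ((0, 1, 0),
        (0, 1, 0),
        (0, 1, 0))
print(memoized_center(tile))  # -> 1; any repeated call is a pure cache hit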

Notable projects for building CA acceleration hardware include the cellular automata machine (CAM) by Norman Margolus and Tommaso Toffoli, which underwent several iterations in the 1980s. The first CAM prototype, described in 1984, could update a grid of 256 by 256 cells at a rate of 60 frames per second, or nearly 4 million cell updates per second. This was about a thousand times faster than execution on a comparable general-purpose computer of the time. The speed-up was largely accomplished by mapping CA rules to memory and scanning over the grid, rather than by genuine parallelization. For comparison, the GPU PyTorch implementation in the next section updates more than 128 million cells each second on a personal workstation.

The supercomputer company Thinking Machines also devoted substantial effort to building a massively parallel architecture for scientific computing. Its Connection Machine line of supercomputers was built with thousands of parallel processors arranged in a fashion akin to CA, but they were hardly the sort of computer one might purchase on a whim. The CA-based architecture made Connection Machines well-suited to many challenging scientific computing problems, but the company filed for bankruptcy in 1994.

Another project specifically aimed at simulating neuronal circuits in CA (with the goal of efficiently controlling a robotic cat) was the CAM-Brain project spearheaded by Hugo de Garis. The project ran for nearly a decade, building various prototype CA machines amenable to genetic programming and implemented in FPGAs, a sort of programmable hardware. While the project never reached its stated goal of controlling a robotic pet, it did develop a spiking neural model called CoDi and published a suite of preliminary experiments in 2001.

The examples mentioned so far have been pretty exotic. Not every researcher is ready to park a room-sized supercomputer in their office, and hard-wired solutions like application-specific integrated circuits (ASICs) are likely to be too static and inflexible for open-ended research. Luckily, as we’ve seen in the work on learning CA published in Distill, we can take advantage of the substantial hardware and software development efforts dedicated to deep learning: the deep learning tech stack transfers readily to research and development with cellular automata.

For flexible applications and exploratory development, CA implementations can take advantage of general-purpose GPUs and multicore CPUs for a significant speedup. In the final section of this article, we’ll walk through a simple demonstration of speeding up a CA system using the deep learning library PyTorch.

In this simple benchmark we implement Conway’s Game of Life using convolution primitives in PyTorch. We’ll compare the PyTorch implementation on both GPU and CPU devices to a naïve loop-based implementation.

The PyTorch update function defaults to running a single step on the CPU, but these options can be specified by the user:

def gol_step(grid, n=1, device="cpu"):
    # Fall back to the CPU if CUDA is not available
    if not torch.cuda.is_available():
        device = "cpu"
    # 3x3 kernel that sums the 8 neighbors of each cell (center weight is 0)
    my_kernel = torch.tensor([[1, 1, 1], [1, 0, 1], [1, 1, 1]])
    my_kernel = my_kernel.unsqueeze(0).unsqueeze(0).float().to(device)
    old_grid = grid.float().to(device)
    while n > 0:
        # Neighbor counts for every cell in a single 2D convolution
        temp_grid = F.conv2d(old_grid, my_kernel, padding=1)
        new_grid = torch.zeros_like(old_grid)
        new_grid[temp_grid == 3] = 1             # any cell with exactly 3 neighbors lives
        new_grid[old_grid * temp_grid == 2] = 1  # a live cell with 2 neighbors survives
        old_grid = new_grid.clone()
        n -= 1
    return new_grid.to("cpu")
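
As a quick sanity check (my addition, not part of the original benchmark), four updates should translate a glider exactly one cell diagonally:

# Place a glider on an otherwise empty 16x16 grid
grid = torch.zeros(1, 1, 16, 16)
for r, c in [(1, 2), (2, 3), (3, 1), (3, 2), (3, 3)]:
    grid[0, 0, r, c] = 1

# A glider has period 4 and displaces by (1, 1) per period
after = gol_step(grid, n=4)
print(torch.equal(after, torch.roll(grid, shifts=(1, 1), dims=(2, 3))))  # True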

The naïve implementation scans through the grid with two nested loops; it’s the sort of exercise you might implement as a simple “Hello World” when learning a new language. Unsurprisingly, it is very slow.

def gol_loop(grid, n=1):
    old_grid = grid.squeeze().int()
    dim_x, dim_y = old_grid.shape
    my_kernel = torch.tensor([[1, 1, 1], [1, 0, 1], [1, 1, 1]]).int()
    while n > 0:
        new_grid = torch.zeros_like(old_grid)
        temp_grid = torch.zeros_like(old_grid)
        # Visit every cell and sum its neighborhood by hand
        for xx in range(dim_x):
            for yy in range(dim_y):
                # Crop the kernel at the grid edges to match the clipped neighborhood
                y_stop = 3 if yy < (dim_y - 1) else -1
                x_stop = 3 if xx < (dim_x - 1) else -1
                temp_sum = torch.sum(
                    my_kernel[1 * (not (xx > 0)):x_stop, 1 * (not (yy > 0)):y_stop]
                    * old_grid[max(0, xx - 1):min(dim_x, xx + 2),
                               max(0, yy - 1):min(dim_y, yy + 2)])
                temp_grid[xx, yy] = temp_sum
        new_grid[temp_grid == 3] = 1
        new_grid[old_grid * temp_grid == 2] = 1
        old_grid = new_grid.clone()
        n -= 1
    return new_grid

A simple benchmark script iterates GOL updates ranging from 1 to 6000 steps, but you should feel free to write your own benchmark script to compare the implementations on your own machine. Also note that I chose a grid size of 256 by 256 to match the 1984 CAM demo (the naïve loop implementation runs on a smaller 64 by 64 grid), but additional speedup is to be expected with larger grids.

import time

import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F

for num_steps in [1, 6, 60, 600, 6000]:
    # Naive loop implementation (on a smaller 64x64 grid, and skipped
    # entirely for the longest runs because it is far too slow)
    grid = 1.0 * (torch.rand(1, 1, 64, 64) > 0.50)
    if num_steps < 601:
        t0 = time.time()
        grid = gol_loop(grid, n=num_steps)
        t1 = time.time()
        print("time for {} gol_loop steps = {:.2e}".format(num_steps, t1 - t0))

    # Convolutional implementation with PyTorch (CPU)
    grid = 1.0 * (torch.rand(1, 1, 256, 256) > 0.50)
    t2 = time.time()
    grid = gol_step(grid, n=num_steps)
    t3 = time.time()
    print("time for {} gol steps = {:.2e}".format(num_steps, t3 - t2))
    if num_steps < 601:
        print("loop/pt = {:.4e}".format((t1 - t0) / (t3 - t2)))

    # Convolutional implementation with PyTorch (GPU)
    grid = 1.0 * (torch.rand(1, 1, 256, 256) > 0.50)
    t4 = time.time()
    grid = gol_step(grid, n=num_steps, device="cuda")
    t5 = time.time()
    print("time for {} gol steps = {:.2e}".format(num_steps, t5 - t4))
    if num_steps < 601:
        print("loop/pt = {:.4e}, loop/gpupt = {:.4e}, pt/gpupt = {:.4e}".format(
            (t1 - t0) / (t3 - t2), (t1 - t0) / (t5 - t4), (t3 - t2) / (t5 - t4)))
    else:
        print("pt/gpupt = {:.4e}".format((t3 - t2) / (t5 - t4)))
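
One caveat about the timings (my note, not from the original): CUDA kernel launches are asynchronous, so wrapping GPU code in time.time() calls can under-report the real cost. Here the final .to("cpu") transfer inside gol_step forces a synchronization, so the measured interval does include the GPU compute, but the very first CUDA call still pays one-time initialization overhead, which makes the 1-step GPU timing look disproportionately slow.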

Credit: BecomingHuman By: James Montantes
