fredrik.sandin@gmail.com


Fredrik Sandin,

EISLAB,

Luleå tekniska universitet,

971 87 Luleå,

Sweden.

CBVS is an implementation of Binary Vector Symbols (BVS), Sparse Distributed Memory (SDM) and the Analogical Mapping Unit (AMU) in the C programming language. These models belong to a class of computational cognitive architectures called Vector Symbolic Architectures (VSAs), which represent symbols and relationships with random points, distributions and mathematical operations in high-dimensional spaces. Research on VSAs dates back several decades and involves about two hundred authors (Ross Gayler keeps a list of contributions).

This software is a result of collaboration with Ph.D. student Blerim Emruli and Dr. Ross Gayler. See our paper in Cognitive Computation and the paper at IJCNN 2013, and the references therein, for further information. Work in this direction was stimulated by questions from Prof. Jerker Delsing at EISLAB concerning the possibility of using “brain-like” computational principles in the design of future automation systems. Alternative approaches include neuromorphic engineering, conventional probabilistic machine learning with structured representations, and more general mathematical approaches to universal intelligence.

The following figure illustrates an example of inference using the AMU (more information below).

VSAs use random distributed representations of symbols, which differ from the structured representations and numbers that scientists and engineers typically use. The use of distributed representations has interesting consequences for how information can be encoded and processed:

- Multiple chunks of information can be superposed dynamically, and the superpositions can be processed without prior decomposition.
- The capacity mimics Miller's law and is determined by the signal-to-noise ratio (SNR); see the papers by Kanerva (1997) and Linhares, Chada & Aranha (2011).
- The standard deviation of the SNR decreases with increasing dimensionality, and simulations suggest that it is negligible compared to the expectation value at a dimensionality of about 1e4, which coincidentally is comparable to the number of synapses of biological neurons.
- There are simple mechanisms for dynamic binding and for coding of sequences with random permutations.
- The representations are robust to bit errors.

The SDM is based on the idea that distances between concepts in memory correspond to distances between points in a high-dimensional space, see for example the book on sparse distributed memory as a mathematical model of long-term memory by Pentti Kanerva (1988). See also the paper by Abbott, Hamrick & Griffiths (2013) on approximating Bayesian inference with an SDM and the paper by Anderson (1989) showing that an SDM does Monte Carlo approximation of a multidimensional conditional probability integral. Interestingly, Curto & Itskov (2008) find that hippocampal place cells adapt to the structure of the stimulus space so that distances are accurately represented by simultaneously active groups of neurons, up to an overall scale factor. Does that principle generalize to other cells and other types of symbols, for example to concept cells and conceptual relations? Is it a consequence of a general topological adaptation mechanism of hippocampal memory circuits?

VSAs have limitations that are partially understood and discussed in the references above. In particular, the approach is not based on first principles. The rich dynamics and interactions of real neurons (including dendrites, synapses, glia and chemical substances) and complex networks of brain circuits are not accounted for in these models. The hope of developing an effective theory of cognitive processes rests on the possibility that brain circuits are subject to parameter space compression, like other physical systems that allow for accurate descriptions of long-scale observables in terms of tractable mathematical models. Another limitation is the problem of inferring VSA representations of symbols from unstructured sensor information (grounding), which is not much studied and needs further work related to pattern recognition.

I find these toy models of cognitive processes interesting because the assumptions and principles are simple, yet some characteristics similar to those of cognitive processes in humans emerge. Unlike action potentials, the elementary representations of a VSA can be communicated over a network with moderate requirements on timing, and the high tolerance to noise allows for communication over noisy channels. Results obtained in experiments with animals are a natural source of information for further development. For example, recent evidence that bees can learn abstract conceptual and relational rules is an interesting opportunity to learn more about cognitive processes in a system with relatively few neurons (less than 1 million vs. 100 billion in humans); see the work by Aurore Avargues-Weber, Adrian G. Dyer and Martin Giurfa (2011). The mathematical basis and motivation need further development (Bayesian statistics, information theory, statistical physics), and the models need to be challenged with applications involving unstructured information.

Programming languages are designed to be precise, which results in complex, error-prone and expensive programs in many applications. We need new computational principles and methodologies to design software for smart homes, robots and other systems that interact with people in complex and changing environments. VSAs are by no means a solution to that problem, but they offer an alternative way to think about it.

The Neural Engineering Framework developed by Chris Eliasmith and his group implements VSA mechanisms in spiking neural networks, see Nengo for further information. The work by Tony Plate on Holographic Reduced Representations (HRR) plays a central role in that context, see for example his book on HRR for further information.

The motivation for developing this C code is that VSA simulations can be prohibitively slow when the bitwise operations are not handled carefully. The present implementation requires about one clock cycle per bit and activated hard location in an SDM/AMU for retrieval of symbols, and slightly more for storage. This implementation is used for research purposes and is provided as-is under the GPL. Applications may require an FPGA or a distributed implementation, depending on how many hard locations are needed. The memory examples described by Kanerva in 1988, with one million hard locations, require about ten billion bytes of high-speed storage, which is feasible today. The next version of the code will include a compile-time option to calculate the variance of activated SDM counters.

The code includes a file named main.c that demonstrates the core functions of CBVS:

```
> ./CBVS
********************************************************
Demonstration of C binary vector symbols (CBVS) toolbox.
- Representation of compositional structures.
- Sparse distributed memory (SDM).
- Analogical mapping unit (AMU).
Vectors are 8192 bits (128 long words).
********************************************************
------------------------------------------------------
Chunking fidelity test, probing one filler in N chunks
------------------------------------------------------
correlation with 1 chunks: 1.000
correlation with 3 chunks: 0.511
correlation with 5 chunks: 0.376
correlation with 7 chunks: 0.310
correlation with 9 chunks: 0.274
correlation with 11 chunks: 0.255
correlation with 13 chunks: 0.234
correlation with 15 chunks: 0.220
------------------------------------
Demo of chunking with variadic macro
------------------------------------
correlation with 3 chunks: 0.511
correlation with 5 chunks: 0.376
-------------------------------------------------
SDM storage and retrieval demo with 100 locations
and 10% active locations per write/read operation
-------------------------------------------------
correlation between y and y' with 25 stored vectors: 0.892
correlation between y and y' with 50 stored vectors: 0.710
correlation between y and y' with 75 stored vectors: 0.611
correlation between y and y' with 100 stored vectors: 0.540
correlation between y and y' with 125 stored vectors: 0.496
correlation between y and y' with 150 stored vectors: 0.415
correlation between y and y' with 175 stored vectors: 0.414
correlation between y and y' with 200 stored vectors: 0.409
Retrieval efficiency: 1.0 clock cycles per bit and location
Storage efficiency: 1.3 clock cycles per bit and location
---------------------------------------------------
AMU inference demo using an SDM with 1000 locations.
Solving a Ravens Progressive Matrix on the form:
A --> AA --> AAA
B --> BB --> BBB
C --> CC --> ?
---------------------------------------------------
Correlation between mapping and alternative targets:
1. CCC ~ 0.553
2. C ~ 0.389
3. BBB ~ 0.377
4. CC ~ 0.310
5. CCCC ~ 0.321
6. B ~ 0.265
7. DDD ~ 0.256
8. BBBB ~ 0.205
9. BB ~ 0.192
Probing fillers in mapping result:
Shape A ~ 0.105
Shape B ~ 0.137
Shape C ~ 0.380
Shape D ~ -0.017
Number 1 ~ 0.135
Number 2 ~ 0.024
Number 3 ~ 0.225
Number 4 ~ -0.010
Position 1 ~ 0.137
Position 2 ~ -0.003
Position 3 ~ 0.250
Position 4 ~ 0.002
```

The average of the correlations calculated in the AMU inference demonstration is illustrated in the figure at the top of the page, which is taken from the IJCNN2013 paper.

You can download CBVS here and the GPL license here.

Extract the archive and type make on a command line to compile CBVS.

This is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

This is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

The following paper describes the model implemented in CBVS and can be cited as a reference.

```
@Article{Emruli:2013cc,
  title     = {Analogical Mapping with Sparse Distributed Memory:
               A Simple Model that Learns to Generalize from Examples},
  author    = {Emruli, Blerim and Sandin, Fredrik},
  year      = {2014},
  volume    = {6},
  number    = {1},
  issn      = {1866-9956},
  journal   = {Cognitive Computation},
  publisher = {Springer-Verlag},
  pages     = {74-88},
  doi       = {10.1007/s12559-013-9206-3},
  url       = {http://pure.ltu.se/portal/files/42274072/BlerimEmruli_2013a.pdf},
  note      = {Model is available at http://pendicular.net/cbvs.php}
}
```


**Mon Sep 14, 2015**

Dr. Vasaki Ponnusamy from Quest International University in Malaysia is visiting us for one month
with support from STINT.

**Mon Aug 31, 2015**

Sergio visits EUSIPCO 2015 and presents our
paper.

**Wed May 27, 2015**

Sergio Martin del Campo Barraza presents his
licentiate thesis on autonomous sensor systems.
Congratulations!
Thanks to the opponent, Karl Skretting,
for his helpful questions at the seminar.

**Mon Apr 27, 2015**

Participated in CapoCaccia, great event!
Implemented a real-time Matlab/Mex interface
(screenshot)
for the DAVIS device and learned more about
criticality and the
ROLLS.

**Fri Dec 12, 2014**

Gunnar Öquist Fellow ceremony.
A memorable day and a great and timely opportunity.
More info here.

**Thu Nov 13, 2014**

Received an encouraging letter from the student union. Thank you for nominating me!

**Fri Nov 7, 2014**

Presenting our work on interoperability by learning
at BICA 2014 in Boston.

**Wed Oct 1, 2014**

Blerim defends his
Ph.D. thesis.
Congrats! See also the news article.

**Thu Jun 5, 2014**

Workshop on bio-inspired computation, organized at UTP, Malaysia.
Thanks to STINT, UTP and MMU for support.

**Fri Apr 25, 2014**

Ph.D. student Sergio Martin del Campo attends MLSS2014
and presents a poster.

News archive.