

Fredrik Sandin,
Luleå tekniska universitet,
971 87 Luleå,


C Binary Vector Symbols (CBVS)


CBVS is an implementation of Binary Vector Symbols (BVS), Sparse Distributed Memory (SDM) and the Analogical Mapping Unit (AMU) in the C programming language. These models belong to a class of computational cognitive architectures called Vector Symbolic Architectures (VSAs), which represent symbols and relationships with random points, distributions and mathematical operations in high-dimensional spaces. Research on VSAs dates back several decades and involves about two hundred authors (Ross Gayler keeps a list of contributions).

This software is a result of collaboration with Ph.D. student Blerim Emruli and Dr. Ross Gayler. See our paper in Cognitive Computation and the paper at IJCNN 2013, and the references therein, for further information. Work in this direction was stimulated by questions from Prof. Jerker Delsing at EISLAB concerning the possibility of using “brain-like” computational principles in the design of future automation systems. Alternative approaches include neuromorphic engineering, conventional probabilistic machine learning with structured representations, and more general mathematical approaches to universal intelligence.

The following figure illustrates an example of inference using the AMU (more information below).

[Figure: RPM example solved by the AMU]

Distributed representation of working memory

VSAs use random distributed representations of symbols, which differ from the structured representations and numbers that scientists and engineers typically use. Distributed representations have interesting consequences for how information can be encoded and processed:

- Multiple chunks of information can be superposed dynamically, and the superpositions can be processed without prior decomposition.
- The capacity mimics Miller's law and is determined by the signal-to-noise ratio (SNR); see the papers by Kanerva (1997) and Linhares, Chada & Aranha (2011).
- The standard deviation of the SNR decreases with increasing dimensionality, and simulations suggest that it is negligible compared to the expectation value at a dimensionality of about 1e4, which coincidentally is comparable to the number of synapses of biological neurons.
- There are simple mechanisms for dynamic binding and for coding of sequences with random permutations.
- The representations are robust to bit errors.

Sparse distributed long-term memory

The SDM is based on the idea that distances between concepts in memory correspond to distances between points in a high-dimensional space, see for example the book on sparse distributed memory as a mathematical model of long-term memory by Pentti Kanerva (1988). See also the paper by Abbott, Hamrick & Griffiths (2013) on approximating Bayesian inference with an SDM and the paper by Anderson (1989) showing that an SDM does Monte Carlo approximation of a multidimensional conditional probability integral. Interestingly, Curto & Itskov (2008) find that hippocampal place cells adapt to the structure of the stimulus space so that distances are accurately represented by simultaneously active groups of neurons, up to an overall scale factor. Does that principle generalize to other cells and other types of symbols, for example to concept cells and conceptual relations? Is it a consequence of a general topological adaptation mechanism of hippocampal memory circuits?

Limitations and directions for research

VSAs have limitations that are partially understood and are discussed in the references above. In particular, the approach is not based on first principles: the rich dynamics and interactions of real neurons (including dendrites, synapses, glia and chemical substances) and the complex networks of brain circuits are not accounted for in these models. The hope to develop an effective theory of cognitive processes rests on the possibility that brain circuits are subject to parameter space compression, like other physical systems that allow for accurate descriptions of long-scale observables in terms of tractable mathematical models. Another limitation is the problem of inferring VSA representations of symbols from unstructured sensor information (grounding), which is not much studied and needs further work related to pattern recognition.

I find these toy models of cognitive processes interesting because the assumptions and principles are simple, yet some characteristics similar to those of cognitive processes in humans emerge. Unlike action potentials, the elementary representations of a VSA can be communicated over a network with moderate requirements on timing, and the high tolerance to noise allows for communication over noisy channels.

Results obtained in experiments with animals are a natural source of information for further development. For example, recent evidence that bees can learn abstract conceptual and relational rules offers an interesting possibility to learn more about cognitive processes in a system with relatively few neurons (less than 1 million, versus about 100 billion in humans); see the work by Aurore Avargues-Weber, Adrian G. Dyer and Martin Giurfa (2011). The mathematical basis and motivation need further development (Bayesian statistics, information theory, statistical physics), and the models need to be challenged with applications involving unstructured information.

Programming languages are designed to be precise, which results in complex, error-prone and expensive programs in many applications. We need new computational principles and methodologies to design software for smart homes, robots and other systems that interact with people in complex and changing environments. VSAs are by no means a solution to that problem, but they offer an alternative way to think about it.

VSAs and spiking neural networks

The Neural Engineering Framework developed by Chris Eliasmith and his group implements VSA mechanisms in spiking neural networks, see Nengo for further information. The work by Tony Plate on Holographic Reduced Representations (HRR) plays a central role in that context, see for example his book on HRR for further information.

CBVS code

The motivation for developing this C code is that VSA simulations can be prohibitively slow when the bitwise operations are not handled carefully. The present implementation requires about one clock cycle per bit and activated hard location in an SDM/AMU for retrieval of symbols, and slightly more for storage. This implementation is used for research purposes and is provided as-is under the GPL. Applications may require an FPGA or a distributed implementation, depending on how many hard locations are needed. The memory examples described by Kanerva in 1988, with one million hard locations, require about ten billion bytes of high-speed storage, which is feasible today. The next version of the code will include a compile-time option to calculate the variance of activated SDM counters.

Example 1

The code includes a file named main.c that demonstrates the core functions of CBVS:

> ./CBVS

Demonstration of C binary vector symbols (CBVS) toolbox.
- Representation of compositional structures.
- Sparse distributed memory (SDM).
- Analogical mapping unit (AMU).
Vectors are 8192 bits (128 long words).

Chunking fidelity test, probing one filler in N chunks
correlation with 1 chunks: 1.000
correlation with 3 chunks: 0.511
correlation with 5 chunks: 0.376
correlation with 7 chunks: 0.310
correlation with 9 chunks: 0.274
correlation with 11 chunks: 0.255
correlation with 13 chunks: 0.234
correlation with 15 chunks: 0.220

Demo of chunking with variadic macro
correlation with 3 chunks: 0.511
correlation with 5 chunks: 0.376

SDM storage and retrieval demo with 100 locations
and 10% active locations per write/read operation
correlation between y and y' with 25 stored vectors: 0.892
correlation between y and y' with 50 stored vectors: 0.710
correlation between y and y' with 75 stored vectors: 0.611
correlation between y and y' with 100 stored vectors: 0.540
correlation between y and y' with 125 stored vectors: 0.496
correlation between y and y' with 150 stored vectors: 0.415
correlation between y and y' with 175 stored vectors: 0.414
correlation between y and y' with 200 stored vectors: 0.409
Retrieval efficiency: 1.0 clock cycles per bit and location
Storage efficiency: 1.3 clock cycles per bit and location

AMU inference demo using an SDM with 1000 locations.
Solving a Raven's Progressive Matrix of the form:
   A --> AA --> AAA
   B --> BB --> BBB
   C --> CC -->  ?
Correlation between mapping and alternative targets:
1. CCC  ~ 0.553
2. C    ~ 0.389
3. BBB  ~ 0.377
4. CC   ~ 0.310
5. CCCC ~ 0.321
6. B    ~ 0.265
7. DDD  ~ 0.256
8. BBBB ~ 0.205
9. BB   ~ 0.192

Probing fillers in mapping result:
Shape A ~ 0.105
Shape B ~ 0.137
Shape C ~ 0.380
Shape D ~ -0.017
Number 1 ~ 0.135
Number 2 ~ 0.024
Number 3 ~ 0.225
Number 4 ~ -0.010
Position 1 ~ 0.137
Position 2 ~ -0.003
Position 3 ~ 0.250
Position 4 ~ 0.002

The average of the correlations calculated in the AMU inference demonstration is illustrated in the figure at the top of the page, which is taken from the IJCNN 2013 paper.

Download and installation

You can download CBVS here and the GPL license here.

Extract the archive and type make on a command line to compile CBVS.

License and terms of use

This is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

This is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.


The following paper describes the model implemented in CBVS and can be cited as a reference.

@article{Emruli2014,
        title={Analogical Mapping with Sparse Distributed Memory:
                A Simple Model that Learns to Generalize from Examples},
        author={Emruli, Blerim and Sandin, Fredrik},
        journal={Cognitive Computation},
        note={Model is available at}
}


Etherpad Latex scripts (github)
Femtolensing calculator
Random indexing
SSH-distributed calculations

Technical reports

TDP derivatives (extends Appendix E/F)
Fault detection in district energy systems
Intelligent Industrial Processes

Bulletin board

Thu Oct 08, 2015
Master thesis proposal on embedded streaming text analytics.

Wed Oct 07, 2015
Gunnar Öquist fellows, vice-chancellors and representatives from research funding organisations in Sweden were invited to IVA by the Kempe Foundations to present their view on and discuss the fellowship programme. A very interesting and stimulating meeting.

Thu Sep 24, 2015
Master thesis proposal on learning condition monitoring system.

Mon Sep 14, 2015
Dr. Vasaki Ponnusamy from Quest International University in Malaysia is visiting us for one month with support from STINT.

Mon Aug 31, 2015
Sergio visits EUSIPCO 2015 and presents our paper.

Wed May 27, 2015
Sergio Martin del Campo Barraza presents his licentiate thesis on autonomous sensor systems. Congratulations! Thanks to the opponent, Karl Skretting, for his helpful questions at the seminar.

Mon Apr 27, 2015
Participated at CapoCaccia, great event! Implemented a realtime Matlab / Mex interface (screenshot) for the DAVIS device and learned more about criticality and the ROLLS.

Fri Dec 12, 2014
Gunnar Öquist Fellow ceremony. A memorable day and a great and timely opportunity. More info here.

Thu Nov 13, 2014
Received an encouraging letter from the student union. Thank you for nominating me!

Fri Nov 7, 2014
Presenting our work on interoperability by learning at BICA 2014 in Boston.

Wed Oct 1, 2014
Blerim defends his Ph.D. thesis. Congrats! See also the news article.

Thu Jun 5, 2014
Workshop on bio-inspired computation, organized at UTP, Malaysia. Thanks to STINT, UTP and MMU for support.

Fri Apr 25, 2014
Ph.D. student Sergio Martin del Campo attends MLSS2014 and presents a poster.

News archive.
