Wednesday, 26 June 2013

Codes, Information, Meaning and Intelligence (and some neuroscience as well)

Note: This has been "under construction" for too long, and it still is. The background question is whether a formalism of necessary and sufficient conditions and rules of induction for psychological phenomena could be postulated, in a way similar to the formalisms from which physical theories depart. This would include postulates about the type of system in which such phenomena can be observed and the relevant levels of analysis of such a system. I am posting anyway...


(Re-) Defining Ontology for Theories of Psychological Phenomena


The use of Codes, Information and Meaning in Theories of  Psychological Science

I have been wanting to write about this subject forever, and many have done so before me, so I make no promises to be telling you something you didn't already know. I can only promise a perspective that should send you off to reconsider some of your assumptions about common terminology used in psychological science: information, meaning and codes.

It was especially an anecdote shared by Walter Freeman in Amsterdam, at the 2009 inaugural conference of the Society for Complex Systems in Cognitive Science (SCSCS), that triggered me to sort some things out for myself. He told us that Shannon had visited his lab to talk about the exciting fields of information and communication science he had basically inspired into existence. Shannon had been very clear about one thing: the way physiologists and psychologists were using his concept of information and information-processing in biological systems was wrong, or at least not properly examined to be right. Now, I do not like arguments from authority, but if the progenitor of a scientific field tells you that you need to do some more serious thinking until you're certain you're right, I would be very worried. The proponents of the computer metaphor of human behaviour and cognition (a contemporary version of mechanistic philosophy confused for experimental philosophy) were less so.

What's wrong about information processing in biological systems?
I'm not sure it's wrong in principle to use the terms information and processing in relation to the behaviour of living systems, as long as you know what you mean by the words. My point is that, in psychological science, this is not the case.

Miller (2003), looking back at a very impressive career, acknowledges that Shannon's theory of information did not get him very far, so he adopted Chomsky's syntax theory:
"I was therefore ready for Chomsky’s alternative to Markov processes. Once I understood that Shannon’s Markov processes could not converge on natural language, I began to accept syntactic theory as a better account of the cognitive processes responsible for the structural aspects of human language. The grammatical rules that govern phrases and sentences are not behavior. They are mentalistic hypotheses about the cognitive processes responsible for the verbal behaviors we observe."
What happens here, in my opinion, is that information gets mixed up with meaning, cognition and code. Of course Shannon's Markov processes did not converge on natural language; they weren't supposed to. Grammatical rules are codes that can give meaning to information. In a very embodied and embedded way, the rules of grammar of natural languages provide us with information about the structure of the physical world, not just human language (but that's another story).

Here the grammatical rules are considered evidence of cognitive processes that can explain our verbal behaviour. This seems sympathetic, but our verbal behaviour cannot be neatly described by syntactic rules that knit together packages of information in order to produce meaningful language. The rules, the code, connect the world of auditory signals with the world of our conscious mind by their ability to decode or encode a message as verbal auditory information. In syntactic theory, meaning, code and information are all internalised as cognitive processes or representations.

In chapter 5 (Beyond the static phoneme boundary) of my dissertation I discuss some of these issues and quote Lisker and Abramson who point out the relevant question to ask is what the relation (code) is between the information signal (acoustic signal) and meaning (linguistic expression):
Lisker and Abramson gave an excellent description of this biased view of reality some 40 years ago when discussing the Chomsky and Halle –linguist– theory of speech: “Their concern is not how an articulatory sequence and its associated acoustic signal, both of them physically neither purely continuous nor purely digital in nature, are related to a linguistic expression, but rather to impose digitalization on the physical description in such a way that it will necessarily be a description of the segments in the linguist’s spelling of the expression.” (Lisker & Abramson, 1971, p. 781). 
So, first, I'll try to separately define the concepts of information, meaning and cognition (defined as intelligence) as is common in information theory and physics. Second, I'll try to make sense of the significance of the concept of a code with respect to information and meaning. Third, I'll give an example of the inappropriate use of information, defined as a meaningful representation / decoded information that I used in my dissertation.


1. Intelligence: Making sense of information

Suppose you receive an e-mail filled with Chinese characters and one picture of a pile of blue-ish powder. It's quite easy to measure the information contained in that e-mail, including the image, and express it in bits and bytes. That's what information is: a measurable quantity used in information science and physics. Information and physical systems are linked through the concept of entropy. Shannon entropy may be loosely defined as the number of bits needed to describe the unique features of an information structure. Analogously, entropy in physics could be described as the amount of information that is needed to describe the unique modes of behaviour of a physical system. A completely random system has many such modes: it is disordered, anything can happen, and thus it has a high entropy. A deterministic system has low entropy: it is highly ordered, and only a few things described by a deterministic rule can happen. The information of a physical system therefore lies in its degrees of freedom, and the mathematical description of the entropy of a physical system is equivalent for all intents and purposes to Shannon entropy.
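To make "measurable quantity" a bit more concrete, here is a minimal sketch in Python. It assumes the simplest possible measure, the zero-order entropy of the byte frequencies (one of several ways to estimate Shannon entropy); the function name and the example messages are made up for illustration.

```python
import math
import random
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Zero-order Shannon entropy of `data`, in bits per byte."""
    if not data:
        return 0.0
    total = len(data)
    # Sort the counts so the floating-point sum does not depend on symbol order.
    counts = sorted(Counter(data).values())
    return sum((n / total) * math.log2(total / n) for n in counts)

rng = random.Random(0)  # fixed seed, only to make the sketch repeatable

ordered = b"a" * 1024      # highly ordered: only one thing can happen
two_state = b"ab" * 512    # two equally frequent symbols
disordered = bytes(rng.getrandbits(8) for _ in range(1024))  # "anything can happen"

# Hypothetical message, and the very same bytes in shuffled order.
message = "an old poem about powdered tusk".encode("utf-8")
scrambled = bytes(rng.sample(list(message), len(message)))

print(shannon_entropy(ordered))      # 0.0
print(shannon_entropy(two_state))    # 1.0
print(shannon_entropy(disordered))   # close to the maximum of 8 bits per byte
print(shannon_entropy(message) == shannon_entropy(scrambled))  # True
```

Under this measure the readable message and its scrambled version carry exactly the same amount of information. The number says nothing about whether anyone can make sense of either.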

Back to our e-mail. Here's what information is not: Your guess that this must be a spam e-mail about some Asian Viagra alternative. What you have done is decode the information based on expectation, in order to add meaning to it. This is not what information theory and information science are about: There is no intrinsic meaning to information. You used your intelligence (inter-ligere: linking things together) to make sense of the information (see Desurvire, 2009, p. 38). This is the purpose of intelligence in all the different contexts in which the word can be used, from secret service agencies to scientists studying the behaviour of living systems: Control the flow of information to select what is relevant given current needs, expectations and priorities.

So you ask your friend about the Chinese e-mail. His wife is Asian, and she says it's not a spam e-mail but some kind of poetry or prose. Finally, a professor of Chinese Literature you contacted kindly tells you it's an old poem praising the benefits of digesting large quantities of ground rhinoceros tusk for keeping a marriage exciting after 50 years. You just got the mail by mistake: apparently a lot of people in China also have an e-mail address that starts with B.Lee@.

You weren't that far off using your intelligence! The devil is of course in the details, as always.


1.2 Intelligence as a postulated condition for psychological phenomena

[This general point is made much better than I can in "On Intelligence From First Principles: Guidelines for Inquiry Into the Hypothesis of Physical Intelligence (PI)" Turvey & Carello (2012)]

Sometimes we forget, but what Psychological Science studies always boils down to intelligence as defined above. Most of the differences between theories in Psychological Science concern differences in the proposed mechanisms or ontology for intelligent behaviour as the end product of our ability to control the flow of information to select what is relevant given current needs, expectations and priorities. 

The strength of this definition is that it does not make any ontological claims about how humans and other living systems actually achieve intelligent behaviour. That is a strength, because it helps us define what a psychological phenomenon is and what is not. In fact, it is perhaps better to speak of an agent-environment system, as this includes nonliving, artificial systems that can behave intelligently in a dynamically changing environment. Other postulates for a formalism of psychological phenomena could restrict the domain of psychological phenomena, but I believe it is important not to interpret the words control and select as actions by an agent, but as descriptors of the phenomenon.

Let's attempt a scientific description of the professor decoding the e-mail. Intelligence as defined here can be used by a theory to:
- Suggest a computational integration algorithm that collates relevant perceptual input in order to match it to an internal mental representation of an exemplar category of ground rhinoceros tusk.
- Propose that the visual information will resonate with the concept of ground rhinoceros tusk, which is an aggregate of parallel distributed networks of time-locked sensory experiences with tusks, powders and what not.
- Argue that there is a match between the needs, expectations and priorities emerging from the biophysical properties of the organism and the opportunities for action allowed for by the physical structures in its sociocultural environment (affordance).

The weakness of this definition is that this leaves a lot of room to fanny about.


2.1 Codes, Organic and otherwise.

[The main reference for most of what I post in this paragraph is: The Organic Codes (Barbieri, 2003)]

The Chinese e-mail example shows that one source of information can be given different meanings, depending on the code that one uses. Barbieri characterises codes as follows:
  1. They are rules of correspondence that connect two independent worlds 
  2. They give meaning to information structures
  3. They are collective, or community rules, conventions that do not depend on the individual features of their structures
With meaning and information defined as independent entities, an example can be given of different and independent evolutions of information and meaning (Barbieri, 2003, p. 97).
Evolution of information without a change of meaning (informatic process):
Pater -> Padre, Père, Vater, Father

Evolution of meaning without a change of information (semantic process):
Ape -> In English: A tailless primate, in Italian: A honey-making insect

In the first case the same meaning is described by different information structures; in the second case the same information structure describes different meanings. In the case of writing systems and natural languages it is clear how these rules of correspondence can be seen as community rules or conventions. Barbieri argues that the organic codes (also see Barbieri, 2006), like the genetic code, are not very different from our writing systems:
  1. Two independent worlds are connected: nucleic acids and proteins
  2. Genetic and epigenetic processes (transcription, translation, splicing, etc.) form codified assemblies by means of specific (meaningful) sets of correspondence rules between the independent worlds.
  3. It is a convention of nature to do so.
So, a code is a relational concept, not a "thing" and therefore an excellent subject for science to study: "The aim of science is not things themselves, as the dogmatists in their simplicity imagine, but the relation between things." (Poincaré, 1905, pp. xxiv).
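The pater and ape examples above can be sketched as plain lookup tables. This is only a toy illustration in Python: the dictionaries, the glosses and the function names are my own assumptions, not Barbieri's formalism. It does show that the "code" is nothing but the relation between two sets of things.

```python
# Hypothetical toy codes: each one is a rule of correspondence between
# information structures (character strings) and meanings (glosses).
CODES = {
    "Latin":   {"pater": "male parent"},
    "Italian": {"padre": "male parent", "ape": "honey-making insect"},
    "French":  {"père": "male parent"},
    "German":  {"Vater": "male parent"},
    "English": {"father": "male parent", "ape": "tailless primate"},
}

def decode(structure, code_name):
    """Meaning that one code gives to a structure, or None if it gives none."""
    return CODES[code_name].get(structure)

def meanings_of(structure):
    """Semantic process: one information structure, a meaning per code."""
    return {name: code[structure]
            for name, code in CODES.items() if structure in code}

def structures_for(meaning):
    """Informatic process: one meaning, an information structure per code."""
    return {name: s
            for name, code in CODES.items()
            for s, m in code.items() if m == meaning}

print(meanings_of("ape"))
# {'Italian': 'honey-making insect', 'English': 'tailless primate'}
print(structures_for("male parent"))
# {'Latin': 'pater', 'Italian': 'padre', 'French': 'père', 'German': 'Vater', 'English': 'father'}
print(decode("ape", "German"))  # None: this toy code gives the string no meaning
```

Nothing in the string "ape" itself changes between the two lookups. Only the code that is applied to it differs.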

There is of course more to tell, for instance that codes need (more than one kind of) memory in order to be able to give meaning. I believe a generalisation of the concept of degeneracy as suggested by Edelman and Gally (2001) provides more than enough theory to deal with storage and "representation hungry" problems of cognition.

Oh! There it is, representation.


2.2 Enter Psychological Science: Representation.

My impression is that theories in psychological science that use the concept of the (mental) representation do so when they want to refer to a mixture of code, information and meaning. As I wrote in a previous post, my favourite example is the scientific theory about reading aloud written text that involves converting graphemes into phonemes. What are graphemes? Well, they are abstract representations of letters. What are phonemes? They are abstract representations of speech sounds. So reading printed or written text out loud is in fact converting letters into sounds by applying Grapheme-to-Phoneme Conversion rules (GPC-rules).

...but that is the same as describing what sounding-out letters is, just using words we made up: Apply the bloblob to pleplep conversion rules to convert letters (bloblobs) into sounds (plepleps).

Why would we need graphemes and phonemes to explain this phenomenon scientifically? The information structures are the printed words. The codes are the conventional rules of the writing system, and applying the rules gives these information structures meaning in terms of pronunciation, or sequences of speech gestures. Note that the meaning of the words themselves, used as a tool of language and communication, is of another order and requires many more, mainly cultural, conventions and intelligence, like the codified assemblies of epigenesis. As shown above, one can easily invent pseudo-words and neologisms that have no meaning (yet), but can be read aloud by applying the code. As was shown by the Chinese e-mail example, the interpretation or meaning given to the information structure depends on the codes and intelligence used. (A poem in this context is a good example of an information structure whose meaning can evolve within a lifetime).
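That pseudo-words can be read aloud by applying the code alone can be sketched in a few lines of Python. The rule table below is entirely hypothetical: a handful of letter-to-sound correspondences with IPA-like labels I picked for illustration, nowhere near the fuzzy and exception-ridden rules of a real writing system. The longest-match-first strategy is likewise just one simple way to apply such a table.

```python
# Hypothetical, tiny set of correspondence rules between letter strings and
# speech-sound labels. Multi-letter entries stand in for digraphs.
GPC_RULES = {
    "sh": "ʃ", "th": "θ",
    "b": "b", "l": "l", "p": "p", "s": "s", "t": "t", "h": "h",
    "o": "ɒ", "e": "ɛ",
}

def read_aloud(word, rules=GPC_RULES):
    """Convert a letter string to a list of sound labels, longest match first."""
    longest = max(len(letters) for letters in rules)
    sounds = []
    i = 0
    while i < len(word):
        for size in range(longest, 0, -1):
            chunk = word[i:i + size]
            if chunk in rules:
                sounds.append(rules[chunk])
                i += len(chunk)
                break
        else:
            # No rule covers this letter: the code gives it no pronunciation.
            raise KeyError(f"no rule for {word[i]!r}")
    return sounds

print(read_aloud("bloblob"))  # ['b', 'l', 'ɒ', 'b', 'l', 'ɒ', 'b']
print(read_aloud("pleplep"))  # ['p', 'l', 'ɛ', 'p', 'l', 'ɛ', 'p']
print(read_aloud("shob"))     # ['ʃ', 'ɒ', 'b']
```

The table never needs to know what a bloblob is. The lookup connects strings of letters to sequences of sounds, and whatever the word means is a matter for other codes.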

The only thing that matters to the agent-environment system that wants to behave intelligently when confronted with a stream of information in the form of strings of printed characters is to learn the code, the GPC-rules (which may be rather fuzzy and will contain many exceptions and perhaps contextual contradictions) that connect the world of written language to the world of spoken language through codified assemblies. All the representing is achieved on the one hand by the information structures allowed for by the writing system, and on the other hand by the speech signal that comprises the auditory information structures that can be given meaning by the codes of the language in question. What matters are the relations between the things.


2.3 (A/U)n informed summary.

In many cases, the additional internal storage of information structures seems unnecessary (this resembles Andy Clark's 007-principle: "Know only as much as you need to know to get the job done"). Where the codes relevant to psychological phenomena reside (if anywhere), how science can describe them formally, and how they can be acquired and applied by a living system is what should be the topic of scientific debate.

This is not a simple matter, as Walter Freeman has shown in his study of the olfactory bulb of rabbits: After a molecule (an information structure representing the scent of a banana) touches the sensory organ, the specificity of the neural pattern (a possible anchor for meaning) is already lost after the first layers of neurons. Whatever information processing in biological systems might look like, it is most certainly not like any kind of traceable information structure passed around the nervous system as a neat package.

A discussion of many of these issues with representation, information and intelligent behaviour by agent-environment systems can be found in Radical Embodied Cognitive Science by Tony Chemero. Tony was kind enough to jump in at the last minute on a symposium I had organised about issues with psychological theorising at the 33rd annual meeting of the Cognitive Science Society, CogSci2011 in Boston (I also organised one at ICPA, same year). Be sure to check his papers on the theory of affordances and hypersets.


To conclude, it seems that at least three evident category mistakes are possible when the term representation is not properly defined with respect to information, meaning or code:

  1. One and the same information structure (i.e., ape) that is given N different meanings by N different codes, is taken for N different information structures that need to be internally represented (a mental lexicon / library / file-drawer / similarity neighbourhood).
  2. When different information structures, by means of an intelligently selected codified assembly, can be given one and the same meaning (father written in different alphabetic languages), the (ad-hoc) assembly of codes is taken for an information structure that needs to be internally represented (a category, exemplar representation, stereotype, etc.).
  3. A code, or an assembly of codes, will be mistaken for an independent cognitive process, module or component in the cognitive architecture, when it is in fact a representation of an acquired set of (conventional) regularities between different information structures in the environment of the agent.



Finally, the brain.


3. Circular circularity: The speech mode in auditory perception

Most of this is from my dissertation, chapter 5: Beyond the static phoneme boundary.

As an example of mistake 1 above, consider the problems that arise when qualitatively different intelligent responses to the very same stream of information are interpreted as an existence proof for separate, discrete, abstract representations of encoded information. The following studies on the speech mode in auditory perception (Serniclaes, Sprenger-Charolles, Carré, & Démonet, 2001) illustrate this.

When the same set of sine wave stimuli (same information structure) is introduced to participants either as electronic whistling sounds that need to be discriminated (use a specific codified assembly: whistles, pitch, envelope, etc.) or as speech-like sounds that need to be discriminated (use another (overlapping?) set of codes: speech), the second instruction causes participants to perceive a phoneme boundary (result of intelligent control of the information stream: pick up only speech-relevant information). This would be expected with discrimination of actual speech sounds only. The perceptual boundary is not observed for the exact same stimuli when the first instruction is used (result of intelligent control of the information stream: pick up only pitch-relevant information). This effect has been interpreted as a so-called ‘speech mode’ of auditory perception, in which a non-speech stimulus only resonates with the representation of a speech sound when the instruction is given to discriminate speech sounds.

The speech mode has neural correlates (Dehaene-Lambertz et al., 2005). Several experiments are reported in the study that basically uses the same instruction variation as mentioned above, adapted for measurements of brain activity using EEG, MEG, and fMRI. The authors report three main conclusions, of which the first is a confirmation of the instruction effect found by Serniclaes et al. (2001). 

 Text taken from the paper by Dehaene-Lambertz et al., 2005:
 “First, the same auditory stimuli are processed differentially depending on whether they are perceived as speech or as nonsense electronic whistles. Second, the posterior part of the superior temporal sulcus and the supramarginal gyrus are crucial areas for syllable processing but are not involved in the processing of the same physical dimension when the stimuli are not perceived as speech. Third, non-phonemic auditory representation and phonemic representation are computed in parallel, but the phonemic network is more efficient and its activation may have an inhibitory effect on the acoustical network.” (Dehaene-Lambertz et al., 2005, pp. 32).
The first conclusion seems straightforward, but is already circular: The same stimuli are processed differently depending on how they are perceived. Ask yourself this: how does the brain of a listener know whether a sound will be perceived as a tone or as a speech sound, before it is perceived? Of course, in the context of the experiment the participant is alerted by the instruction to intelligently decode either sound-relevant or speech-relevant information, but how does this generalize to real-life situations in which there is no instruction? The same problem applies to the second conclusion: How is it possible that the brain regions that supposedly "compute" the representations of syllables are only active when the sounds are (going to be) perceived as syllables once computation has finished?

As a pure description of the experimental findings by Dehaene-Lambertz et al. (2005) these conclusions might be acceptable when interpreted as reporting a correlation: This is what we observe when a stimulus is perceived as speech rather than as a sound. The circularity emerges due to the suggestion that different brain regions are involved in the processing of the same physical dimension of an auditory signal, dependent on the outcome of this process. The third conclusion seems to confirm the assumption that the authors are not just reporting a correlation, but interpreting a difference they observed as the causal power of the phonemic representation to suppress an auditory percept.

This study inflates the amount of representing being done by the brain. Apparently the same auditory stimulus has two different representations (representing what exactly? The information is the same!). One is non-phonemic and is processed by the auditory network, whereas the other is phonemic and processed by the phonemic network. Does this mean the authors suggest that auditory stimuli with physical dimensions that may be perceived as speech sounds, but are not, have a speech-sound representation as well as an auditory representation?

It is almost impossible to break out of the circular reasoning.

(told you it was still under construction)




References


Barbieri, M. (2006). Life and semiosis: The real nature of information and meaning. Semiotica, 2006(158), 233–254. doi:10.1515/SEM.2006.007
Dehaene-Lambertz, G., Pallier, C., Serniclaes, W., Sprenger-Charolles, L., Jobert, A., & Dehaene, S. (2005). Neural correlates of switching from auditory to speech perception. NeuroImage, 24(1), 21–33. doi:10.1016/j.neuroimage.2004.09.039
Edelman, G. M., & Gally, J. A. (2001). Degeneracy and complexity in biological systems. Proceedings of the National Academy of Sciences of the United States of America, 98(24), 13763–13768. doi:10.1073/pnas.231499798
Miller, G. A. (2003). The cognitive revolution: A historical perspective. Trends in Cognitive Sciences, 7(3), 141–144. doi:10.1016/S1364-6613(03)00029-9
Serniclaes, W., Sprenger-Charolles, L., Carré, R., & Démonet, J.-F. (2001). Perceptual discrimination of speech sounds in developmental dyslexia. Journal of Speech, Language, and Hearing Research, 44(2), 384–399. doi:10.1044/1092-4388(2001/032)
Turvey, M. T., & Carello, C. (2012). On Intelligence From First Principles: Guidelines for Inquiry Into the Hypothesis of Physical Intelligence (PI). Ecological Psychology, 24(1), 3–32. doi:10.1080/10407413.2012.645757