Dipartimento di Fisica e Scienze della Terra

STAT PHYS at UNIPR

 
Entropic Distances and Information Content

 



 

 

Measures of distance and of information content arise naturally in the evolution of character sequences. Examples include cluster mobility in spin systems, the characterization of strings in languages, and biological sequences. The potential fields of investigation are extremely varied: the emergence of structures in disordered models, the analysis of evolutionary trees for biological systems or languages, the attribution of authorship in texts, and so on. Metric considerations play a central role in all these contexts, because configurations must be compared with one another. However, the distances considered so far are generally limited to summarizing local differences. As for the quantitative estimate of the information stored in strings, the typical tools are based on compression algorithms. The related probabilistic concepts mostly refer to empirical frequencies obtained from databases and historical records.
 


A radically different approach is being pursued by the Parma group. It is based on the information required to distinguish two probabilistic schemes. This is achieved through Rohlin's distance, a metric defined on a "partition space" onto which configurations, or states, can be projected. The Rohlin functional, supplemented with tools and methods specifically designed to amplify and optimize the detection of emergent novelty in phenomena, uses a different probabilistic framework, based on the geometry of partitions rather than on frequencies.
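As an illustration of the metric itself, the standard definition of the Rohlin distance between two finite partitions a and b is d_R(a, b) = H(a|b) + H(b|a) = 2 H(a v b) - H(a) - H(b), where H is the Shannon entropy and a v b is the common refinement of the two partitions. The Python sketch below computes it for partitions of a finite set of sites. It rests on assumptions not stated on this page: every site carries equal weight, a partition is encoded as a list of atom labels, and the function names are hypothetical. It does not include the group's supplementary tools.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy (in nats) of a partition given as a list of atom labels.

    labels[i] names the atom containing site i; each site has weight 1/n
    (an assumption made for this sketch).
    """
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def rohlin_distance(a, b):
    """Rohlin distance H(a|b) + H(b|a), computed as 2 H(a v b) - H(a) - H(b)."""
    if len(a) != len(b):
        raise ValueError("both partitions must cover the same set of sites")
    # The common refinement a v b has one atom per distinct pair of labels.
    join = list(zip(a, b))
    return 2 * entropy(join) - entropy(a) - entropy(b)


if __name__ == "__main__":
    a = [0, 0, 1, 1]  # two atoms: {0, 1} and {2, 3}
    b = [0, 1, 0, 1]  # two atoms: {0, 2} and {1, 3}
    print(rohlin_distance(a, a))  # identical partitions are at distance 0
    print(rohlin_distance(a, b))  # equals 2*log(2), about 1.386
```

The distance depends only on how the sites are grouped, not on the labels: relabeling the atoms of a partition leaves it at distance zero from the original.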

The Rohlin metric method has already been applied in two main fields:
1) Analysis of cluster dynamics in spin systems, sandpile models and, possibly, cellular automata on arbitrary graphs. Provisional results concern the inspection of chaotic behavior near a putative phase transition, where the method serves as an instrument able to detect hidden fine structures (1-2).
2) Analysis of emergent phenomena in evolving RNA viral sequences (3). This is a rather new approach to a field with both theoretical and practical aspects of great relevance. The method allows for a "black box" analysis of the amino acid sequences in RNA strings. In the important case of influenza viruses, the method has given evidence of an emergent structure of weak attractors that is quite compatible with the epidemiological history of the disease. Being independent of external biological information, this structure could prove useful in vaccine forecasting and design.
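To make the idea of projecting a sequence onto a partition concrete, the Python sketch below uses one plausible projection: each maximal run of consecutive equal characters becomes one atom. This choice is an assumption for illustration and is not confirmed by this page, as are the equal weighting of positions and the hypothetical function names. The Rohlin distance is then computed between the partitions of two sequences of equal length.

```python
from collections import Counter
from itertools import groupby
from math import log


def run_partition(seq):
    """Label each position by the index of its run of equal characters.

    Assumed projection for this sketch: "AAGGA" -> [0, 0, 1, 1, 2].
    """
    labels = []
    for atom, (_, run) in enumerate(groupby(seq)):
        labels.extend([atom] * len(list(run)))
    return labels


def entropy(labels):
    """Shannon entropy (in nats), with every position weighted equally."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def sequence_distance(s, t):
    """Rohlin distance between the run partitions of two equal-length strings."""
    if len(s) != len(t):
        raise ValueError("sequences must have the same length")
    a, b = run_partition(s), run_partition(t)
    join = list(zip(a, b))  # common refinement of the two partitions
    return 2 * entropy(join) - entropy(a) - entropy(b)


if __name__ == "__main__":
    # Same run boundaries, different letters: the partitions coincide.
    print(sequence_distance("AAGG", "CCTT"))
    # A shifted boundary changes the partition, so the distance is positive.
    print(sequence_distance("AAGG", "AGGG"))
```

Under this projection the distance ignores which characters appear and responds only to where the boundaries between runs fall, which is one way a purely "black box" comparison can work.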

Finally, this method raises a variety of algorithmic and mathematical problems, which are objects of research in themselves.

 

Recent works:

(1) CASARTELLI M., DALL'ASTA L., RASTELLI E., REGINA S. (2004) Metric features of a dipolar model. J. Phys. A: Math. Gen. 37: 11731-11749

(2) AGLIARI E., CASARTELLI M., VIVO E. (2010) Metric characterization of cluster dynamics on the Sierpinski basket. Journal of Statistical Mechanics: Theory and Experiment P09002. doi:10.1088/1742-5468/2010/09/P09002

(3) BURIONI R., SCALCO R., CASARTELLI M. (2011) Rohlin Distance and the Evolution of Influenza A Virus: Weak Attractors and Precursors. PLoS ONE 6(12): e27924. doi:10.1371/journal.pone.0027924