Boris Mirkin

Born: December 5, 1942 (age 81)
Nationality: Russian
Alma mater: Saratov State University
Occupation: Scientist

Boris Grigorevich Mirkin (Russian: Борис Григорьевич Миркин), born December 5, 1942, is a Russian scientist working in data analysis and decision-making methodologies. He graduated from the Faculty of Mathematics and Mechanics of Saratov State University, Russia, in 1964 and defended a “Candidate of Sciences” thesis there in 1966 (the degree is equivalent to an international PhD) on a subject in abstract automata theory. The thesis was based on correspondences between regular expressions and abstract automata noted by Boris Mirkin and his PhD supervisor, Mark Spivak. Prof. J. Brzozowski (University of Waterloo, Canada) described these results in several synopses in the Journal of Symbolic Logic in 1969–1971, leading to recognition of what is sometimes referred to as Mirkin’s prebase.

In 1967, B. Mirkin moved to the Institute of Economics, Siberian Branch of the USSR Academy of Sciences, Novosibirsk, Russia. There he started research on binary relations as a medium for decision making, since he considered socio-economic decisions to be mostly non-quantitative (at least, that was the case in the USSR). Three results in this direction can be mentioned:

(a) A different characterization of interval orders (1970, published internationally in 1972);

(b) Development of what was later called Mirkin’s distance between partitions;

(c) Extension of Arrow’s celebrated theorem on the impossibility of democratic choice to arbitrary relations; in its latest version, this was a characterization of Arrow’s two axioms (monotonicity and the “independence of irrelevant alternatives”) as the “federation consensus rules”. Given a set of actors with their preference relations R_i, a federation is a set of actor coalitions S={s}, and a federation rule has the set-theoretic format R = ∪_{s∈S} ∩_{i∈s} R_i.
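
As an illustration of item (b), here is a minimal sketch of a pair-counting distance between two partitions of the same objects. It assumes the commonly cited form of Mirkin’s distance, in which a pair of objects contributes when one partition puts the two objects together and the other separates them. Counting unordered pairs is an assumption of this sketch (some formulations count ordered pairs, which doubles the value), and the function name `mirkin_distance` is hypothetical.

```python
from itertools import combinations


def mirkin_distance(labels_a, labels_b):
    """Count unordered object pairs on which two partitions disagree.

    Each partition is given as a list of cluster labels, one per object.
    A pair disagrees when it is joined in one partition and split in the other.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same objects")
    disagreements = 0
    for i, j in combinations(range(len(labels_a)), 2):
        together_a = labels_a[i] == labels_a[j]
        together_b = labels_b[i] == labels_b[j]
        if together_a != together_b:
            disagreements += 1
    return disagreements


# Four objects: partition a = {0,1},{2,3}; partition b = {0,1,2},{3}
print(mirkin_distance([0, 0, 1, 1], [0, 0, 0, 1]))  # prints 3
```

The distance depends only on which objects are grouped together, so renaming the cluster labels of either partition leaves it unchanged.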

B. Mirkin’s first monograph, on the mathematics of group choice, propelled him into the top ranks of the Soviet mathematical economics research community, and also opened a way for scientists in Soviet satellite countries such as Bulgaria to do research and get published on this subject.

Then, motivated by E. Braverman (1931-1977), a founding father of Soviet pattern recognition and data science research, B. Mirkin and his collaborators started a program of research projects in mixed scale data analysis oriented towards practical applications. References to E. Braverman can be found in a later volume. Among Mirkin’s results of that period, the following should be mentioned:

(d) Developing a machinery of models, methods and codes for finding, in similarity and interaction data, approximate partitions and individual clusters, as well as more unusual structures such as a “structured partition” or “chain order partition”, along with interesting applications in genetics, organizational structure design, sociology, and ecology.

(e) Categorical factor analysis, embracing what is referred to as additive cluster analysis, that is, sequential extraction of various cluster structures from similarity matrices, together with estimates of the contribution of individual structures to the total variance.

(f) Matrix correlation for mixed scale data including quantitative and nominal features, leading to a successful method for bi-clustering, followed by methods for “relative grouping” of sociological survey data over one group of features (say, demographics) with respect to another group of features (say, leisure time behavior).

Unfortunately, most of these were published in Russian only and are now largely forgotten. Three monographs by B. Mirkin should be mentioned, though, as well as collections edited by him. Some traces of these developments can be found in other publications.

From 1983 to 1991, B. Mirkin worked at CEMI, the Central Economics-Mathematics Institute, Moscow, USSR. One result of that period should be mentioned:

(g) Considering clustering within what is currently called the auto-encoder paradigm (referred to by Mirkin as his “data recovery approach”), B. Mirkin embraced Principal Component Analysis and k-means clustering in a unified framework, currently referred to as the matrix factorization model, and proposed a method for “Principal cluster analysis”, which he currently refers to as Anomalous clustering and which proved effective in many different frameworks.
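
To make the idea of Anomalous clustering in item (g) more concrete, here is a minimal sketch of extracting a single anomalous cluster. It assumes the commonly described scheme: the data are centered at the grand mean, which serves as the reference point; the object farthest from the reference seeds the cluster; then the cluster and its centroid are updated in turn, k-means style, with the reference point held fixed. The function name `anomalous_cluster`, the iteration cap, and the toy data are all hypothetical, and the sketch is not claimed to reproduce the published algorithm in every detail.

```python
def anomalous_cluster(points, max_iter=100):
    """Extract one cluster that lies far from the grand mean of the data.

    Returns (member indexes, cluster centroid in original coordinates).
    """
    n, dim = len(points), len(points[0])
    mean = [sum(p[d] for p in points) / n for d in range(dim)]
    # Center the data so that the reference point becomes the origin.
    centered = [[p[d] - mean[d] for d in range(dim)] for p in points]
    origin = [0.0] * dim

    def sqdist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))

    # Seed the centroid with the object farthest from the reference point.
    seed = max(range(n), key=lambda i: sqdist(centered[i], origin))
    centroid = centered[seed]
    members = []
    for _ in range(max_iter):
        # An object joins the cluster if it is closer to the centroid
        # than to the fixed reference point.
        new_members = [i for i in range(n)
                       if sqdist(centered[i], centroid) < sqdist(centered[i], origin)]
        if not new_members or new_members == members:
            break
        members = new_members
        centroid = [sum(centered[i][d] for i in members) / len(members)
                    for d in range(dim)]
    return members, [c + m for c, m in zip(centroid, mean)]


data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11)]
members, center = anomalous_cluster(data)
print(members, center)
```

Under these assumptions, the procedure can be repeated on the objects that remain after removing the extracted cluster, which yields clusters one by one.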

From 1991 to 2011, B. Mirkin travelled extensively, ending with an appointment at Birkbeck, University of London, UK (2000-2011). These results of that period are worth mentioning:

(h) Building an outline of a sound mathematical theory for non-probabilistic data analysis, including a Pythagorean decomposition of the quadratic data scatter into the sum of two items: the k-means square error criterion (the unexplained part) and a complementary criterion (the explained part). The explained part sheds new light on such issues as the “real” goal of k-means clustering (finding big anomalous clusters, according to B. Mirkin) and the contributions of nominal features, which turn out to coincide with various measures of deviation from statistical independence, including the celebrated Pearson’s chi-squared association index, and which also turn out to depend on the data normalization scaling used.

(i) Developing the concept of the “Quetelet index”, a measure of association between feature categories, which allowed him both to unify “the dual” approaches of the so-called Analyse des Correspondances developed by Benzécri in France and to give an operational meaning to Pearson’s chi-squared association index, which has been ubiquitously considered only as a criterion of statistical independence.

(j) Mathematically modelling inconsistencies between individual gene families and the structure of the evolutionary “species” tree via histories of gene gain and loss events, which steered the team led by the celebrated E. Koonin towards a reconstruction of the genome of LUCA, the Last Universal Common Ancestor, as well as that for lactic acid bacteria. Original models involving duplication events were developed by Mirkin and co-authors as well.

(k) Consensus clustering via the “projective” distance between partitions, which allows for mathematically equivalent formulations of the same criterion in three different frameworks: similarity between objects, object-to-category data table, and association indexes over contingency tables.
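
Items (h) and (i) above can be checked numerically. The sketch below first verifies the Pythagorean decomposition for a given partition: with data centered at the grand mean, the total scatter equals the explained part (cluster sizes times squared centroid norms) plus the unexplained part (the k-means square error). It then computes Quetelet indexes for a contingency table, assuming the usual definition q(l|k) = p_kl / (p_k p_l) - 1, that is, the relative change in the probability of category l once category k is known, and checks that their average weighted by p_kl equals Pearson’s chi-squared divided by the number of observations. The function names and the toy data are hypothetical, and the table is assumed to have no empty rows or columns.

```python
def scatter_decomposition(points, labels):
    """Return (total, explained, unexplained) scatter for a given partition."""
    n, dim = len(points), len(points[0])
    mean = [sum(p[d] for p in points) / n for d in range(dim)]
    y = [[p[d] - mean[d] for d in range(dim)] for p in points]  # centered data
    total = sum(v * v for row in y for v in row)
    explained = unexplained = 0.0
    for k in set(labels):
        rows = [y[i] for i in range(n) if labels[i] == k]
        c = [sum(r[d] for r in rows) / len(rows) for d in range(dim)]
        explained += len(rows) * sum(v * v for v in c)
        unexplained += sum((r[d] - c[d]) ** 2 for r in rows for d in range(dim))
    return total, explained, unexplained


def quetelet_summary(table):
    """Return (Quetelet indexes, their p-weighted average, chi-squared / N)."""
    n = sum(sum(row) for row in table)
    rows, cols = len(table), len(table[0])
    p = [[table[k][l] / n for l in range(cols)] for k in range(rows)]
    p_row = [sum(p[k]) for k in range(rows)]
    p_col = [sum(p[k][l] for k in range(rows)) for l in range(cols)]
    q = [[p[k][l] / (p_row[k] * p_col[l]) - 1 for l in range(cols)]
         for k in range(rows)]
    average_q = sum(p[k][l] * q[k][l] for k in range(rows) for l in range(cols))
    phi2 = sum((p[k][l] - p_row[k] * p_col[l]) ** 2 / (p_row[k] * p_col[l])
               for k in range(rows) for l in range(cols))
    return q, average_q, phi2


data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11)]
total, explained, unexplained = scatter_decomposition(data, [0, 0, 0, 0, 1, 1])
print(total, explained + unexplained)

q, average_q, phi2 = quetelet_summary([[20, 10], [10, 60]])
print(average_q, phi2)
```

In both printed pairs the two numbers agree up to floating-point rounding, which is the operational reading of the chi-squared index mentioned in item (i): it is an average relative change in category probabilities.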

Since 2011, B. Mirkin has worked at the Higher School of Economics, Moscow. These developments by B. Mirkin and co-authors from that period should be mentioned:

(l) Core-shell cluster, a cluster with an explicitly indicated “core” of more tightly connected elements, which proved fundamental in the three-step cluster analysis of the off-coastal upwelling phenomenon over annual sea surface temperature data.

(m) Modeling conceptual generalization as optimally lifting fuzzy leaf sets in a domain taxonomy so as to minimize the numbers of “gaps” and “offshoots”, leading to the derivation of research domain tendencies or to extending audience sizes for internet advertisements.

This article "Boris Mirkin" is from Wikipedia. The list of its authors can be seen in its history. Articles taken from the Draft Namespace on Wikipedia can be accessed in Wikipedia's Draft Namespace.