Sherman Visual Lab: Science Studio, Computer Science
08/22/2017 Tuesday

eEducation, eBusiness & eArts

Computer Science

From Dirac Notation to Probability Bracket Notation,

Information Retrieval (IR) and Artificial Intelligence (AI)


Author: Dr. Xing M (Sherman) Wang

Dirac notation (or bra-ket notation) is a powerful and indispensable tool for modern physicists. Unfortunately, it is taught only in Quantum Mechanics. I believe it would be valuable to introduce it in Applied Mathematics as well (for example, in Linear Algebra).

On the other hand, while studying probability theory, I felt that a similar notation for representing and deriving probabilistic formulas would be very helpful.

That is why I posted the following articles online, in which Dirac Notation is introduced to IR, and Probability Bracket Notation is proposed and applied to IR and AI.

Whether you agree or disagree with my work, I welcome and appreciate your opinions.

How to email the author:
Subject line: About the articles on your web site
Email address: swang (at) shermanlab (dot) com
or via arxiv.org, if you are a member

1: Dirac Notation, Fock Space and Riemann Metric Tensor in IR Models

HTML;   PDF: Current (06/21/2011);   Archived

Abstract

Using Dirac Notation as a powerful tool, we investigate the three classical Information Retrieval (IR) models and some of their extensions. We show that almost all such models can be described by vectors in Occupation Number Representations (ONR) of Fock spaces with various specifications on, e.g., occupation number, inner product or term-term interactions. As important case studies, Concept Fock Space (CFS) is introduced for the Boolean Model, and the basic formulas for Singular Value Decomposition (SVD) of the Latent Semantic Indexing (LSI) Model are manipulated in terms of Dirac notation. Based on SVD, a Riemannian metric tensor is introduced, which not only can be used to calculate the relevance of documents to a query, but also may be used to measure the closeness of documents in data clustering.
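As a small illustration of the inner-product view mentioned in the abstract, the sketch below scores documents against a query in a plain term-frequency vector space, using the normalized inner product (cosine) as the relevance. It is not taken from the paper: the toy documents, the query and the function names (term_vector, inner, cosine_relevance) are assumptions made for illustration, and neither Fock spaces nor the SVD-based metric tensor are implemented here.

```python
import math
from collections import Counter


def term_vector(text, vocabulary):
    """Term-frequency vector of `text` over a fixed, ordered vocabulary."""
    counts = Counter(text.lower().split())
    return [counts[term] for term in vocabulary]


def inner(u, v):
    """Euclidean inner product, the analogue of <u|v> for real vectors."""
    return sum(a * b for a, b in zip(u, v))


def cosine_relevance(doc_vec, query_vec):
    """Normalized inner product <d|q> / (|d| |q|); 0.0 if either vector is zero."""
    norm = math.sqrt(inner(doc_vec, doc_vec) * inner(query_vec, query_vec))
    return inner(doc_vec, query_vec) / norm if norm else 0.0


# Hypothetical toy collection (not the textbook example used in the paper).
docs = {
    "d1": "markov chain markov model",
    "d2": "hilbert space vector",
    "d3": "vector space vector space model",
}
query = "vector space"

# The vocabulary is the sorted set of all terms seen in the documents and query.
vocabulary = sorted({t for text in list(docs.values()) + [query] for t in text.split()})

q = term_vector(query, vocabulary)
scores = {name: cosine_relevance(term_vector(text, vocabulary), q)
          for name, text in docs.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
```

Here ranking lists the document names from most to least relevant to the query under this simple weighting.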


2: Probability Bracket Notation, Probability Vectors, Markov Chains and Stochastic Processes

PDF: Current (07/17/2007);   Archived

Abstract

Dirac notation has been widely used for vectors in Hilbert spaces of Quantum Theories. It has now also been introduced to Information Retrieval. In this paper, we propose a new set of symbols, the Probability Bracket Notation (PBN), for probability theories. We define new symbols like probability bra (p-bra), p-ket, p-bracket, sample base, unit operator, state ket and more as counterparts of those in Dirac notation, which we refer to as Vector Bracket Notation (VBN). By applying PBN to represent fundamental definitions and theorems for discrete and continuous random variables, we show that PBN could play the same role in probability sample space as Dirac notation does in Hilbert space. We also find that there is a close relation between our probability state kets and probability vectors in Markov chains, which are involved in data clustering methods like Diffusion Maps. We summarize the similarities and differences between PBN and VBN in the two tables of Appendix A.
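For readers unfamiliar with probability vectors in Markov chains, the following minimal sketch evolves a probability row vector under a row-stochastic transition matrix. It uses ordinary matrix notation rather than PBN, and the two-state matrix T, the initial vector p0 and the helper names step and evolve are hypothetical values chosen for illustration, not taken from the paper.

```python
def step(p, T):
    """One Markov step: row vector p times row-stochastic matrix T."""
    n = len(T)
    return [sum(p[i] * T[i][j] for i in range(n)) for j in range(n)]


def evolve(p, T, steps):
    """Apply `steps` Markov steps to the probability vector p."""
    for _ in range(steps):
        p = step(p, T)
    return p


# Hypothetical two-state chain: each row of T sums to 1.
T = [[0.9, 0.1],
     [0.5, 0.5]]
p0 = [1.0, 0.0]          # start with certainty in state 0

p1 = evolve(p0, T, 1)    # distribution after one step
p50 = evolve(p0, T, 50)  # close to the stationary distribution (5/6, 1/6)
```

Each step preserves the total probability, and for this particular chain the vector settles near the stationary distribution after a few dozen steps.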


3: Induced Hilbert Space, Markov Chain, Diffusion Map and Fock Space in Thermophysics

PDF: Current (04/08/2007);   Archived

Abstract

In this article, we continue to explore Probability Bracket Notation (PBN), proposed in our previous article. Using both Dirac vector bracket notation (VBN) and PBN, we define induced Hilbert space and induced sample space, and propose that there exists an equivalence relation between a Hilbert space and a probability sample space constructed from the same base observable(s). Then we investigate Markov transition matrices and their eigenvectors to make diffusion maps with two examples: a simple graph theory example, which serves as a prototype of a bidirectional transition operator, and a famous text document example from the IR literature, which serves as a tutorial on diffusion maps in text document space. We notice that, in both examples, the sample space of the Markov chain and the Hilbert space spanned by the eigenvectors of the transition matrix are not equivalent. Finally, we apply our PBN and equivalence proposal to Thermophysics by associating phase space with the Hilbert space or Fock space of many-particle systems.


4: Probability Bracket Notation: Term Vector Space, Concept Fock Space and Induced Probabilistic IR Models

PDF: Current (06/21/2011);   Archived  

Abstract

After a brief introduction to Probability Bracket Notation (PBN) for discrete random variables in time-independent probability spaces, we apply both PBN and Dirac notation to investigate probabilistic modeling for information retrieval (IR). We derive the expressions of relevance of document to query (RDQ) for various probabilistic models, induced by Term Vector Space (TVS) and by Concept Fock Space (CFS). The inference network model (INM) formula is symmetric and can be used to evaluate relevance of document to document (RDD); the CFS-induced models contain ingredients of all three classical IR models. The relevance formulas are tested and compared on different scenarios against a famous textbook example.


5: Probability Bracket Notation, Multivariable Systems and Static Bayesian Networks

PDF: Current (10/07/2012);   Archived  

Abstract

Probability Bracket Notation (PBN) is applied to systems of multiple random variables for a preliminary study of static Bayesian Networks (BN) and Probabilistic Graphical Models (PGM). The famous Student BN Example is explored to show the local independences and reasoning power of a BN. The software package Elvira is used to graphically display the student BN. Our investigation shows that PBN provides a consistent and convenient alternative for manipulating many expressions related to joint, marginal and conditional probability distributions in static BN.


6: Probability Bracket Notation: Markov State Chain Projector, Hidden Markov Models and Dynamic Bayesian Networks

PDF: Current (12/06/2012);   Archived  

Abstract

After a brief discussion of the Markov Evolution Formula (MEF) expressed in Probability Bracket Notation (PBN), its close relation with the joint probability distribution (JPD) of Visible Markov Models (VMM) is demonstrated by introducing the Markov State Chain Projector (MSCP). The state basis and the observed basis are defined in the Sequential Event Space (SES) of Hidden Markov Models (HMM). The JPD of HMM is derived by using basis transformation in SES. The Viterbi algorithm is revisited and applied to the famous Weather HMM example, whose node graph and inference results are displayed by using the software package Elvira. In the end, the formulas of VMM, HMM and some factorial HMM (FHMM) are expressed in PBN as instances of dynamic Bayesian Networks (DBN).
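To make the Viterbi step concrete, here is a minimal sketch of the standard algorithm on a small weather-style HMM. The parameters below follow a commonly quoted Rainy/Sunny toy model and are an assumption; they may differ from the Weather HMM example used in the paper, and the code uses plain dictionaries rather than PBN or Elvira.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable hidden state path and its joint probability."""
    # best[t][s]: probability of the best path that ends in state s at time t.
    best = [{s: start_p[s] * emit_p[s][observations[0]] for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path

    for obs in observations[1:]:
        prev = best[-1]
        scores, pointers = {}, {}
        for s in states:
            # Pick the predecessor that maximizes the path probability into s.
            source = max(states, key=lambda r: prev[r] * trans_p[r][s])
            scores[s] = prev[source] * trans_p[source][s] * emit_p[s][obs]
            pointers[s] = source
        best.append(scores)
        back.append(pointers)

    # Backtrack from the most probable final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for pointers in reversed(back[1:]):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[-1][last]


# Assumed toy parameters (hypothetical, not necessarily those of the paper).
states = ["Rainy", "Sunny"]
start_p = {"Rainy": 0.6, "Sunny": 0.4}
trans_p = {"Rainy": {"Rainy": 0.7, "Sunny": 0.3},
           "Sunny": {"Rainy": 0.4, "Sunny": 0.6}}
emit_p = {"Rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
          "Sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1}}

path, prob = viterbi(["walk", "shop", "clean"], states, start_p, trans_p, emit_p)
# path is ["Sunny", "Rainy", "Rainy"], with joint probability 0.01344
```

For long observation sequences, a practical implementation would typically work with log-probabilities to avoid numerical underflow.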


7: Thematic Clustering and the Dual Representations of Text Objects

PDF: Current (01/02/2017);  

Abstract

We introduce Thematic Clustering, a new methodology to discover clusters in a set of text documents and, at the same time, to define the theme of each cluster by using its most frequent keywords. Our procedure is based on the idea of dual representations (TF rep and Concept rep) of text objects (docs or clusters) in term space. We derive cluster TF reps in initial clustering, use them to reduce term space and then renovate clusters. Our test results on three well-known data sets (Disease, Star and Reuters) are very promising: the formed clusters and their themes almost perfectly match our knowledge about the data sets.




More to come, please visit us again!

Copyright © 2002-2016, Sherman Visual Lab