The Axioms of Quantum Mechanics
A post on the governing principles of our universe
March 2020.
The Axioms Under State Vector Formulation

Associated with any isolated physical system is a complex valued Hilbert space corresponding to the state space of the system. The system is completely described by its state vector which is a unit norm vector in the state space.
I denote the physical system’s state space with $\mathscr{H}$.

The evolution of a closed physical system is described by a unitary linear transformation. Thus, the state of the system at time $t_1$, given by its state vector $\ket{\psi_{t_1}}$, is transformed by some transformation, $U$, into the state $\ket{\psi_{t_2}}$ at time $t_2$ according to:
$$\ket{\psi_{t_2}}=U\ket{\psi_{t_1}}$$
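
As a quick numerical illustration of this axiom, here is a minimal NumPy sketch that applies a unitary to a state vector and checks that the norm is preserved. The Hadamard matrix and the starting state $\ket{0}$ are illustrative assumptions of mine, not something fixed by the axiom.

```python
import numpy as np

# Illustrative unitary (assumption): the Hadamard matrix.
U = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)

# Illustrative state at time t1 (assumption): the basis state |0>.
psi_t1 = np.array([1, 0], dtype=complex)

# Axiom 2: |psi_t2> = U |psi_t1>
psi_t2 = U @ psi_t1

# Unitarity means U^dagger U = I, so a unit-norm state stays unit-norm.
is_unitary = np.allclose(U.conj().T @ U, np.eye(2))
norm_t2 = np.linalg.norm(psi_t2)
```

Any other unitary could be substituted for `U`; the norm check should still pass.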


A quantum measurement is described by a collection of linear operators $\{M_{\alpha_1},\dots,M_{\alpha_k}\}$ which act on the state space, $\mathscr{H}$. These operators must satisfy the completeness relation $\sum_{i=1}^{k} M_{\alpha_i}^\dagger M_{\alpha_i}=I$. A quantum measurement on state $\ket{\psi}$ results in both a classical random variable and a residual quantum state.

The classical random variable, $X$, is described by the probability mass function:
$$p(x=\alpha_i)=\bra{\psi} M_{\alpha_i}^\dagger M_{\alpha_i}\ket{\psi}=\lVert M_{\alpha_i}\ket{\psi}\rVert^2$$

Upon observing outcome $\alpha_i$, the initial quantum state $\ket{\psi}$ collapses to:
$$\ket{\psi^\prime}=\frac{M_{\alpha_i}\ket{\psi}}{\sqrt{p(x=\alpha_i)}}$$
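
To make the measurement axiom concrete, the sketch below computes the outcome probabilities and collapsed states for a single qubit. The helper name `measure`, the choice of projectors onto $\ket{0}$ and $\ket{1}$, and the input state are all assumptions made for illustration.

```python
import numpy as np

def measure(psi, operators):
    """Return (probabilities, post-measurement states), one entry per outcome."""
    probs, posts = [], []
    for M in operators:
        unnormalized = M @ psi
        # p = <psi| M^dagger M |psi> = || M|psi> ||^2
        p = float(np.vdot(unnormalized, unnormalized).real)
        probs.append(p)
        # An outcome with zero probability has no post-measurement state.
        posts.append(unnormalized / np.sqrt(p) if p > 0 else None)
    return probs, posts

# Illustrative measurement (assumption): projectors onto |0> and |1>.
M0 = np.array([[1, 0], [0, 0]], dtype=complex)
M1 = np.array([[0, 0], [0, 1]], dtype=complex)

# Illustrative state (assumption): sqrt(3/8)|0> + sqrt(5/8)|1>.
psi = np.array([np.sqrt(3 / 8), np.sqrt(5 / 8)], dtype=complex)
probs, posts = measure(psi, [M0, M1])
```

Here the two outcomes occur with probabilities $\frac{3}{8}$ and $\frac{5}{8}$, and the state collapses to $\ket{0}$ or $\ket{1}$ respectively.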
The state space of a composite physical system (that is, a physical system made up of more than one constituent system) is the tensor product of the constituent systems’ state spaces. If we have a set of constituent systems numbered $i=1,\dots,n$, each in some configuration $\ket{\psi_i}$, then the composite system is in the state given by $\ket{\Psi}=\ket{\psi_1}\otimes\dots\otimes\ket{\psi_n}$, with $\ket{\Psi}\in(\mathscr{H_1}\otimes\dots\otimes\mathscr{H_n})$.
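
For finite-dimensional systems, the tensor product of state vectors can be computed with the Kronecker product. The sketch below is a minimal example under that assumption; the two single-qubit states are illustrative choices.

```python
import numpy as np

# Illustrative constituent states (assumption): |0> and |+>.
psi_1 = np.array([1, 0], dtype=complex)
psi_2 = np.array([1, 1], dtype=complex) / np.sqrt(2)

# Composite state |Psi> = |psi_1> (x) |psi_2>, a vector in a 4-dimensional space.
Psi = np.kron(psi_1, psi_2)

# The composite state is still unit norm.
norm_Psi = np.linalg.norm(Psi)
```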
The Axioms Under Density Operator Formulation

Associated with any isolated physical system is a complex valued Hilbert space corresponding to the state space of the system. The system is completely described by a density operator, $\rho$, acting on the Hilbert space, $\mathscr{H}$.
 The evolution of a closed physical system is described by a unitary linear transformation. Thus, the state of the system at time $t_1$, given by its density operator $\rho_{t_1}$, is transformed by some transformation, $U$, into the state $\rho_{t_2}$ at time $t_2$ according to:
$$\rho_{t_2}=U\rho_{t_1}U^\dagger$$


A quantum measurement is described by a collection of linear operators $\{M_{\alpha_1},\dots,M_{\alpha_k}\}$ which act on the state space, $\mathscr{H}$. These operators must satisfy the completeness relation $\sum_{i=1}^{k} M_{\alpha_i}^\dagger M_{\alpha_i}=I$. A quantum measurement on state $\rho$ results in both a classical random variable and a residual quantum state.

The classical random variable, $X$, is described by the probability mass function:
$$p(x=\alpha_i)=tr(M_{\alpha_i}\rho M_{\alpha_i}^\dagger)$$

Upon observing outcome $\alpha_i$, the initial quantum state $\rho$ collapses to:
$$\rho^\prime=\frac{M_{\alpha_i}\rho M_{\alpha_i}^\dagger}{p(x=\alpha_i)}$$
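
The same measurement rule can be sketched for density operators. In the example below, the helper name `measure_density`, the mixed state, and the projectors onto $\ket{+}$ and $\ket{-}$ are assumptions chosen for illustration.

```python
import numpy as np

def measure_density(rho, operators):
    """Return outcome probabilities and post-measurement density operators."""
    probs, posts = [], []
    for M in operators:
        unnormalized = M @ rho @ M.conj().T
        p = float(np.trace(unnormalized).real)  # tr(M rho M^dagger)
        probs.append(p)
        posts.append(unnormalized / p if p > 0 else None)
    return probs, posts

# Illustrative measurement (assumption): projectors onto |+> and |->.
plus = np.array([1, 1], dtype=complex) / np.sqrt(2)
minus = np.array([1, -1], dtype=complex) / np.sqrt(2)
M_plus = np.outer(plus, plus.conj())
M_minus = np.outer(minus, minus.conj())

# Illustrative mixed state (assumption).
rho = np.array([[3 / 8, 0], [0, 5 / 8]], dtype=complex)
probs, posts = measure_density(rho, [M_plus, M_minus])
```

For this state both outcomes are equally likely, and the post-measurement states are the pure states $\ket{+}\bra{+}$ and $\ket{-}\bra{-}$.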
The state space of a composite physical system (that is, a physical system made up of more than one constituent system) is the tensor product of the constituent systems’ state spaces. If we have a set of constituent systems numbered $i=1,\dots,n$, each described by some density operator $\rho_i$, then the composite system is in the state given by $\rho^\prime=\rho_1\otimes\dots\otimes\rho_n$, where $\rho^\prime$ acts on $\mathscr{H_1}\otimes\dots\otimes\mathscr{H_n}$.
Density Operator Motivation
To motivate use of the density operator formulation in favor of the state vector formulation, consider the introduction of classical uncertainty. For this discussion, let us assume that we are observing some quantum system, $\rho$, and that we are able to prepare many instances of this system for repeated measurement.
If we as an observer are uncertain about the initial state of this system, we may model our uncertainty using a mixture of pure states. That is, we may denote $\rho=\sum_{i=1}^{n} p_i \ket{\psi_i}\bra{\psi_i}$ for some set of states $\{\ket{\psi_i}\}_{i=1}^n$ with associated probabilities $\{p_i\}_{i=1}^n$. For this discussion, we require $p_i\lt 1$ for all $i$.
We are interested in understanding what we may learn about $\rho$ to hopefully reduce our uncertainty. The third axiom of quantum mechanics tells us that everything we can learn about the system through measurement is determined by the outcome distribution $P(X)$ and the resulting post-measurement quantum states.
In other words, the third axiom tells us that the density operator representation of our state, $\rho$, paired with a collection of measurement operators, $\{M_{\alpha_1},\dots,M_{\alpha_k}\}$, encodes everything that we may learn about the system.
This is a very important point because it implies that if more than one ensemble corresponds to the same density operator, then from our perspective as an observer, we will be unable to detect any measurable difference between the different ensembles.
As it turns out, a mixed state may correspond to many different ensembles of pure states; any two such ensembles are related by a unitary matrix, a result known as the unitary freedom in the ensemble.
Thus, the density operator formulation enables an observer to account for their classical uncertainty about the state of a quantum system. The state vector formulation assumes the observer is initially certain about the state of their system.
To illustrate this, consider the following example:
Ensemble #1:
$$
\begin{alignat}{2}
\ket{\psi_{11}}&=\begin{bmatrix}1 \\ 0\end{bmatrix}, \quad\quad &&p_{11}=\frac{3}{8}\nonumber\newline
\ket{\psi_{12}}&=\begin{bmatrix}0 \\ 1\end{bmatrix}, \quad\quad &&p_{12}=\frac{5}{8}\nonumber
\end{alignat}
$$
$$
\begin{align}
\rho_1 &= p_{11}\ket{\psi_{11}}\bra{\psi_{11}} + p_{12}\ket{\psi_{12}}\bra{\psi_{12}}\nonumber\newline
&= p_{11}\begin{bmatrix}1 \\ 0\end{bmatrix}\begin{bmatrix}1 & 0\end{bmatrix} + p_{12}\begin{bmatrix}0 \\ 1\end{bmatrix}\begin{bmatrix}0 & 1\end{bmatrix}\nonumber\newline
&= \frac{3}{8}\begin{bmatrix}1 & 0 \\ 0 & 0\end{bmatrix} + \frac{5}{8}\begin{bmatrix}0 & 0 \\ 0 & 1\end{bmatrix}\nonumber\newline
&=\begin{bmatrix}\frac{3}{8} & 0 \\ 0 & \frac{5}{8}\end{bmatrix}\nonumber
\end{align}$$
Ensemble #2:
$$
\begin{alignat}{2}
\ket{\psi_{21}}&=\begin{bmatrix}\sqrt{\frac{3}{8}} \\ \sqrt{\frac{5}{8}}\end{bmatrix}, \quad\quad &&p_{21}=\frac{1}{2}\nonumber\newline
\ket{\psi_{22}}&=\begin{bmatrix}\sqrt{\frac{3}{8}} \\ -\sqrt{\frac{5}{8}}\end{bmatrix}, \quad\quad &&p_{22}=\frac{1}{2}\nonumber
\end{alignat}
$$
$$
\begin{align}
\rho_2 &= p_{21}\ket{\psi_{21}}\bra{\psi_{21}} + p_{22}\ket{\psi_{22}}\bra{\psi_{22}}\nonumber\newline
&= p_{21}\begin{bmatrix}\sqrt{\frac{3}{8}} \\ \sqrt{\frac{5}{8}}\end{bmatrix} \begin{bmatrix}\sqrt{\frac{3}{8}} & \sqrt{\frac{5}{8}}\end{bmatrix} + p_{22}\begin{bmatrix}\sqrt{\frac{3}{8}} \\ -\sqrt{\frac{5}{8}}\end{bmatrix} \begin{bmatrix}\sqrt{\frac{3}{8}} & -\sqrt{\frac{5}{8}}\end{bmatrix}\nonumber\newline
&= \frac{1}{2}\begin{bmatrix}\frac{3}{8} & \frac{\sqrt{15}}{8} \\ \frac{\sqrt{15}}{8} & \frac{5}{8}\end{bmatrix} + \frac{1}{2}\begin{bmatrix}\frac{3}{8} & -\frac{\sqrt{15}}{8} \\ -\frac{\sqrt{15}}{8} & \frac{5}{8}\end{bmatrix}\nonumber\newline
&=\begin{bmatrix}\frac{3}{8} & 0 \\ 0 & \frac{5}{8}\end{bmatrix}\nonumber
\end{align}
$$
This result shows $\rho_1=\rho_2$, which implies that for any particular measurement $\{M_{\alpha_1},\dots,M_{\alpha_k}\}$, we will be unable to distinguish between the states $\rho_1$ and $\rho_2$ despite the fact that they were defined by different ensembles of pure states.
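
The equality of the two density operators can be checked numerically. The sketch below is a minimal NumPy check; it takes the second state of Ensemble #2 to be $(\sqrt{3/8}, -\sqrt{5/8})$, since the relative minus sign is what makes the off-diagonal terms cancel. The helper name `density` is my own.

```python
import numpy as np

def density(states, probs):
    """Density operator sum_i p_i |psi_i><psi_i| for an ensemble of pure states."""
    return sum(p * np.outer(s, s.conj()) for s, p in zip(states, probs))

a, b = np.sqrt(3 / 8), np.sqrt(5 / 8)

# Ensemble #1: |0> with probability 3/8, |1> with probability 5/8.
rho_1 = density([np.array([1.0, 0.0]), np.array([0.0, 1.0])], [3 / 8, 5 / 8])

# Ensemble #2: (a, b) and (a, -b), each with probability 1/2.
rho_2 = density([np.array([a, b]), np.array([a, -b])], [1 / 2, 1 / 2])

same = np.allclose(rho_1, rho_2)
```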
In summary, for situations where we are unsure of the ensemble comprising our quantum state, we should rely on the density operator formulation to ensure we account for our initial uncertainty.
Extension of example to illustrate unitary freedom:
The theorem of unitary freedom in the ensemble asserts that two sets of normalized states, $\{\ket{\psi_{1i}}\}_{i=1}^{n}$ and $\{\ket{\psi_{2i}}\}_{i=1}^{n}$, with corresponding probabilities $\{p_{1i}\}_{i=1}^n$ and $\{p_{2i}\}_{i=1}^n$, describe the same density operator if and only if
$$
\sqrt{p_{2i}}\ket{\psi_{2i}}= \sum_{j=1}^n u_{ij} \sqrt{p_{1j}}\ket{\psi_{1j}}
$$
for some unitary matrix with entries $u_{ij}$.
I do not include a general proof here, but consider the following unitary (and Hermitian) operator:
$$
U=\begin{bmatrix}\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \end{bmatrix}=U^\dagger
$$
It is easily verified to be unitary:
$$
UU^\dagger=U^\dagger U = U^2 = I
$$
Let $u_{ij}$ denote the entry of $U$ in the $i^{th}$ row and $j^{th}$ column. Now observe
$$
\begin{alignat}{2}
&\frac{1}{\sqrt{2}}\begin{bmatrix}\sqrt{\frac{3}{8}} \\ \sqrt{\frac{5}{8}}\end{bmatrix} &&= \frac{1}{\sqrt{2}} \sqrt{\frac{3}{8}} \begin{bmatrix}1 \\ 0\end{bmatrix} + \frac{1}{\sqrt{2}} \sqrt{\frac{5}{8}} \begin{bmatrix}0 \\ 1\end{bmatrix}\nonumber\newline
= &\sqrt{p_{21}} \ket{\psi_{21}} &&= u_{11} \sqrt{p_{11}} \ket{\psi_{11}} + u_{12} \sqrt{p_{12}} \ket{\psi_{12}}\nonumber
\end{alignat}
$$
and
$$
\begin{alignat}{2}
&\frac{1}{\sqrt{2}}\begin{bmatrix}\sqrt{\frac{3}{8}} \\ -\sqrt{\frac{5}{8}}\end{bmatrix} &&= \frac{1}{\sqrt{2}} \sqrt{\frac{3}{8}} \begin{bmatrix}1 \\ 0\end{bmatrix} - \frac{1}{\sqrt{2}} \sqrt{\frac{5}{8}} \begin{bmatrix}0 \\ 1\end{bmatrix}\nonumber\newline
= &\sqrt{p_{22}} \ket{\psi_{22}} &&= u_{21} \sqrt{p_{11}} \ket{\psi_{11}} + u_{22} \sqrt{p_{12}} \ket{\psi_{12}}\nonumber
\end{alignat}
$$
Thus, we can see that for this particular example the unitary operator U relates the ensembles as described by the theorem of unitary freedom.
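
The relation between the two ensembles can also be checked numerically. In the sketch below I assume the Hadamard matrix for $U$ and the sign convention $\ket{\psi_{22}}=(\sqrt{3/8}, -\sqrt{5/8})$; the columns of `ens_1` and `ens_2` hold the weighted vectors $\sqrt{p}\ket{\psi}$ of each ensemble. These array names are my own.

```python
import numpy as np

a, b = np.sqrt(3 / 8), np.sqrt(5 / 8)

# Columns are sqrt(p) * |psi> for each member of the ensemble.
ens_1 = np.column_stack([np.sqrt(3 / 8) * np.array([1.0, 0.0]),
                         np.sqrt(5 / 8) * np.array([0.0, 1.0])])
ens_2 = np.column_stack([np.sqrt(1 / 2) * np.array([a, b]),
                         np.sqrt(1 / 2) * np.array([a, -b])])

# Assumed unitary: the Hadamard matrix.
U = np.array([[1, 1], [1, -1]]) / np.sqrt(2)

# sqrt(p_2i)|psi_2i> = sum_j u_ij sqrt(p_1j)|psi_1j>, i.e. ens_2 = ens_1 @ U^T.
forward_ok = np.allclose(ens_2, ens_1 @ U.T)

# Inverse relation: column k of ens_1 is sum_i conj(u_ik) * column i of ens_2.
inverse_ok = np.allclose(ens_1, ens_2 @ U.conj())

# Both ensembles give the same density operator.
same_rho = np.allclose(ens_1 @ ens_1.conj().T, ens_2 @ ens_2.conj().T)
```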
Further, the inverse relation can be found. First note that unitarity, $U^\dagger U=I$, gives $\sum_{i=1}^n u_{ik}^* u_{ij} = \delta_{kj}$; then:
$$
\begin{align}
\sum_{i=1}^n u_{ik}^* \sqrt{p_{2i}} \ket{\psi_{2i}}&=\sum_{i=1}^n u_{ik}^* (\sum_{j=1}^n u_{ij} \sqrt{p_{1j}} \ket{\psi_{1j}})\nonumber\newline
&=\sum_{i=1}^n \sum_{j=1}^n u_{ik}^* u_{ij} \sqrt{p_{1j}} \ket{\psi_{1j}}\nonumber\newline
&=\sum_{j=1}^n \sum_{i=1}^n u_{ik}^* u_{ij} \sqrt{p_{1j}} \ket{\psi_{1j}}\nonumber\newline
&=\sum_{j=1}^n \sqrt{p_{1j}} \ket{\psi_{1j}} \sum_{i=1}^n (u_{ik}^* u_{ij})\nonumber\newline
&=\sum_{j=1}^n \sqrt{p_{1j}} \ket{\psi_{1j}} \delta_{kj}\nonumber\newline
&=\sqrt{p_{1k}} \ket{\psi_{1k}}\nonumber
\end{align}
$$
This is easily verified for our example, noting that $U$ is Hermitian, $U=U^\dagger$:
$$
\begin{alignat}{2}
&\sqrt{\frac{3}{8}}\begin{bmatrix}1 \\ 0\end{bmatrix} &&= \frac{1}{\sqrt{2}} \frac{1}{\sqrt{2}} \begin{bmatrix}\sqrt{\frac{3}{8}} \\ \sqrt{\frac{5}{8}}\end{bmatrix} + \frac{1}{\sqrt{2}} \frac{1}{\sqrt{2}} \begin{bmatrix}\sqrt{\frac{3}{8}} \\ -\sqrt{\frac{5}{8}}\end{bmatrix}\nonumber\newline
&\sqrt{\frac{3}{8}}\begin{bmatrix}1 \\ 0\end{bmatrix} &&= \frac{1}{2} \begin{bmatrix}\sqrt{\frac{3}{8}} \\ \sqrt{\frac{5}{8}}\end{bmatrix} + \frac{1}{2} \begin{bmatrix}\sqrt{\frac{3}{8}} \\ -\sqrt{\frac{5}{8}}\end{bmatrix}\nonumber\newline
= &\sqrt{p_{11}} \ket{\psi_{11}} &&= u_{11}^* \sqrt{p_{21}} \ket{\psi_{21}} + u_{21}^* \sqrt{p_{22}} \ket{\psi_{22}}\nonumber
\end{alignat}
$$
and
$$
\begin{alignat}{2}
&\sqrt{\frac{5}{8}}\begin{bmatrix}0 \\ 1\end{bmatrix} &&= \frac{1}{\sqrt{2}} \frac{1}{\sqrt{2}} \begin{bmatrix}\sqrt{\frac{3}{8}} \\ \sqrt{\frac{5}{8}}\end{bmatrix} - \frac{1}{\sqrt{2}} \frac{1}{\sqrt{2}} \begin{bmatrix}\sqrt{\frac{3}{8}} \\ -\sqrt{\frac{5}{8}}\end{bmatrix}\nonumber\newline
&\sqrt{\frac{5}{8}}\begin{bmatrix}0 \\ 1\end{bmatrix} &&= \frac{1}{2} \begin{bmatrix}\sqrt{\frac{3}{8}} \\ \sqrt{\frac{5}{8}}\end{bmatrix} - \frac{1}{2} \begin{bmatrix}\sqrt{\frac{3}{8}} \\ -\sqrt{\frac{5}{8}}\end{bmatrix}\nonumber\newline
= &\sqrt{p_{12}} \ket{\psi_{12}} &&= u_{12}^* \sqrt{p_{21}} \ket{\psi_{21}} + u_{22}^* \sqrt{p_{22}} \ket{\psi_{22}}\nonumber
\end{alignat}
$$
In this section I provide brief support for the equivalence of the two formulations.
Both formulations of Axiom 1 are taken as equivalent.

Let the state at $t_1$ be described by an ensemble of state vectors $\{\ket{\psi_1}_{t_1},\dots, \ket{\psi_n}_{t_1}\}$, with associated probabilities $\{p_1,\dots, p_n\}$.
By definition, the state at $t_1$ has density operator representation:
$$\rho_{t_1}=\sum_{i=1}^{n} p_i \ket{\psi_i}_{t_1}\bra{\psi_i}_{t_1}$$
According to Axiom 2 of the state vector formulation, each state making up the ensemble will evolve according to:
$$\ket{\psi_i}_{t_2}= U \ket{\psi_i}_{t_1}$$
This implies the state at $t_2$ is described by the ensemble of state vectors $\{U\ket{\psi_1}_{t_1},\dots, U\ket{\psi_n}_{t_1}\}$, with associated probabilities $\{p_1,\dots, p_n\}$.
The corresponding density operator for the state at $t_2$ is given by:
$$
\begin{align}
\rho_{t_2}&=\sum_{i=1}^{n} p_i \ket{\psi_i}_{t_2}\bra{\psi_i}_{t_2}\nonumber\newline
&=\sum_{i=1}^{n} p_i (\ket{\psi_i}_{t_2})(\ket{\psi_i}_{t_2})^\dagger\nonumber\newline
&=\sum_{i=1}^{n} p_i (U\ket{\psi_i}_{t_1}) (U\ket{\psi_i}_{t_1})^\dagger\nonumber\newline
&=\sum_{i=1}^{n} p_i U\ket{\psi_i}_{t_1} \bra{\psi_i}_{t_1} U^\dagger\nonumber\newline
&=U (\sum_{i=1}^{n} p_i \ket{\psi_i}_{t_1} \bra{\psi_i}_{t_1}) U^\dagger\nonumber\newline
&=U (\rho_{t_1}) U^\dagger\nonumber
\end{align}
$$
Thus, the two descriptions of unitary time evolution are equivalent.
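
The derivation above can be sanity-checked numerically. The sketch below draws a random ensemble and a random unitary (both illustrative assumptions, with a fixed seed), then compares evolving each state vector against evolving the density operator directly. The helper names are my own.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, arbitrary choice

def random_state(dim):
    """A random unit-norm complex vector."""
    v = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    return v / np.linalg.norm(v)

def density(states, probs):
    """Density operator sum_i p_i |psi_i><psi_i|."""
    return sum(p * np.outer(s, s.conj()) for s, p in zip(states, probs))

# Random unitary from the QR decomposition of a random complex matrix.
U, _ = np.linalg.qr(rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2)))

states = [random_state(2) for _ in range(3)]
probs = [0.2, 0.3, 0.5]

rho_t1 = density(states, probs)
rho_t2_from_vectors = density([U @ s for s in states], probs)  # evolve each |psi_i>
rho_t2_from_operator = U @ rho_t1 @ U.conj().T                 # U rho U^dagger

agree = np.allclose(rho_t2_from_vectors, rho_t2_from_operator)
```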

Let the state under observation be denoted by $\rho$ and described by an ensemble of state vectors $\{\ket{\psi_1},\dots, \ket{\psi_n}\}$, with associated probabilities $\{p_1,\dots, p_n\}$.
We can then use the state vector formulation to describe the following conditional probability. (The second equality below can be established by calculating the trace in an orthonormal basis that has the normalized vector $M_{\alpha_i}\ket{\psi_j}/\lVert M_{\alpha_i}\ket{\psi_j}\rVert$ as one of its elements.)
$$\begin{align}
p(x=\alpha_i \mid \rho=\ket{\psi_j}\bra{\psi_j}) &= \bra{\psi_j}M_{\alpha_i}^\dagger M_{\alpha_i}\ket{\psi_j}\nonumber\newline
&=tr(M_{\alpha_i}\ket{\psi_j} \bra{\psi_j}M_{\alpha_i}^\dagger)\nonumber
\end{align}
$$
Equipped with this we can define the joint distribution,
$$
\begin{align}
\implies p(x=\alpha_i, \rho=\ket{\psi_j}\bra{\psi_j}) &= p(\rho=\ket{\psi_j}\bra{\psi_j}) \times p(x=\alpha_i \mid \rho=\ket{\psi_j}\bra{\psi_j})\nonumber\newline
&= p_{j} \times tr(M_{\alpha_i}\ket{\psi_j} \bra{\psi_j}M_{\alpha_i}^\dagger)\nonumber\newline
&= tr(p_{j} \times M_{\alpha_i}\ket{\psi_j} \bra{\psi_j}M_{\alpha_i}^\dagger)\nonumber\newline
&= tr(M_{\alpha_i}(p_{j} \ket{\psi_j} \bra{\psi_j}) M_{\alpha_i}^\dagger)\nonumber
\end{align}
$$
Finally, we can marginalize the joint distribution over $\rho$ to obtain our probability mass function,
$$
\begin{align}
\implies p(x=\alpha_i) &= \sum_{j=1}^n p(x=\alpha_i, \rho=\ket{\psi_j}\bra{\psi_j})\nonumber\newline
&= \sum_{j=1}^n tr(M_{\alpha_i}(p_{j} \ket{\psi_j} \bra{\psi_j}) M_{\alpha_i}^\dagger)\nonumber\newline
&= tr(\sum_{j=1}^n M_{\alpha_i}(p_{j} \ket{\psi_j} \bra{\psi_j}) M_{\alpha_i}^\dagger)\nonumber\newline
&= tr( M_{\alpha_i}(\sum_{j=1}^n p_{j} \ket{\psi_j} \bra{\psi_j}) M_{\alpha_i}^\dagger)\nonumber\newline
&= tr( M_{\alpha_i}(\rho) M_{\alpha_i}^\dagger)\nonumber
\end{align}
$$
Thus, we have derived the probability mass function as stated in the density operator formulation.
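
As a numerical check of this derivation, the sketch below computes the probability mass function both ways for a small example. The ensemble ($\ket{0}$ with probability $\frac{1}{4}$, $\ket{+}$ with probability $\frac{3}{4}$) and the projective measurement are illustrative assumptions, as are the function names.

```python
import numpy as np

# Illustrative ensemble (assumption): |0> with probability 1/4, |+> with 3/4.
states = [np.array([1, 0], dtype=complex),
          np.array([1, 1], dtype=complex) / np.sqrt(2)]
probs = [1 / 4, 3 / 4]
rho = sum(p * np.outer(s, s.conj()) for s, p in zip(states, probs))

# Illustrative measurement (assumption): projectors onto |0> and |1>.
operators = [np.diag([1, 0]).astype(complex), np.diag([0, 1]).astype(complex)]

def pmf_by_marginalizing(M):
    """sum_j p_j <psi_j| M^dagger M |psi_j> (state vector formulation)."""
    return sum(p * np.vdot(M @ s, M @ s).real for s, p in zip(states, probs))

def pmf_by_trace(M):
    """tr(M rho M^dagger) (density operator formulation)."""
    return np.trace(M @ rho @ M.conj().T).real

pmf_state_vector = [pmf_by_marginalizing(M) for M in operators]
pmf_density = [pmf_by_trace(M) for M in operators]
```

Both lists come out as $\left(\frac{5}{8}, \frac{3}{8}\right)$ for this example.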
Using the same initial state $\rho$, let’s consider the state collapse as prescribed by the state vector formulation.
According to the state vector formulation, if our state was $\ket{\psi_j}$ before taking the measurement and we observe outcome $\alpha_i$, then our state will collapse to $\ket{\psi_j^\prime}=\frac{M_{\alpha_i}\ket{\psi_j}}{\sqrt{p(x=\alpha_i \mid \rho=\ket{\psi_j}\bra{\psi_j})}}$, where the normalization is the conditional probability of the outcome given that the state was $\ket{\psi_j}$.
Since our state $\rho$ begins as a mixture of pure states, observing $\alpha_i$ also updates the probability of each member of the ensemble through Bayes’ rule, $p(\rho=\ket{\psi_j}\bra{\psi_j} \mid x=\alpha_i)=\frac{p_j \, p(x=\alpha_i \mid \rho=\ket{\psi_j}\bra{\psi_j})}{p(x=\alpha_i)}$. The resulting ensemble after observing outcome $\alpha_i$ is given by:
$$\begin{align}
\rho^\prime &= \sum_{j=1}^n p(\rho=\ket{\psi_j}\bra{\psi_j} \mid x=\alpha_i) \ket{\psi_j^\prime}\bra{\psi_j^\prime}\nonumber\newline
&= \sum_{j=1}^n \frac{p_j \, p(x=\alpha_i \mid \rho=\ket{\psi_j}\bra{\psi_j})}{p(x=\alpha_i)} \frac{M_{\alpha_i}\ket{\psi_j} \bra{\psi_j}M_{\alpha_i}^\dagger}{p(x=\alpha_i \mid \rho=\ket{\psi_j}\bra{\psi_j})}\nonumber\newline
&= \sum_{j=1}^n p_j \frac{M_{\alpha_i}\ket{\psi_j} \bra{\psi_j}M_{\alpha_i}^\dagger}{p(x=\alpha_i)}\nonumber\newline
&= \frac{\sum_{j=1}^n p_j M_{\alpha_i}\ket{\psi_j} \bra{\psi_j}M_{\alpha_i}^\dagger}{p(x=\alpha_i)}\nonumber\newline
&= \frac{M_{\alpha_i}(\sum_{j=1}^n p_j \ket{\psi_j} \bra{\psi_j}) M_{\alpha_i}^\dagger}{p(x=\alpha_i)}\nonumber\newline
&= \frac{M_{\alpha_i}(\rho) M_{\alpha_i}^\dagger}{p(x=\alpha_i)}\nonumber
\end{align}
$$
We have now fully derived the density operator formulation of Axiom 3 from the state vector formulation.
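
The collapse rule can be checked numerically as well. The sketch below builds the post-measurement state member by member, assuming each collapsed state is normalized by its own conditional probability and the ensemble weights are updated with Bayes’ rule, and compares the result with $M\rho M^\dagger / p$. The ensemble, the non-projective measurement operators `M_a` and `M_b`, and the function names are illustrative assumptions.

```python
import numpy as np

def collapse_via_ensemble(states, probs, M):
    """Post-measurement density operator built state by state."""
    cond = [float(np.vdot(M @ s, M @ s).real) for s in states]  # p(alpha | psi_j)
    p_alpha = sum(p * c for p, c in zip(probs, cond))           # p(alpha)
    rho_post = np.zeros((len(states[0]),) * 2, dtype=complex)
    for s, p, c in zip(states, probs, cond):
        if c == 0:
            continue  # this member can never produce the outcome
        collapsed = (M @ s) / np.sqrt(c)   # collapsed state vector
        posterior = p * c / p_alpha        # Bayes-updated weight
        rho_post += posterior * np.outer(collapsed, collapsed.conj())
    return rho_post

def collapse_via_density(rho, M):
    """M rho M^dagger / tr(M rho M^dagger)."""
    unnormalized = M @ rho @ M.conj().T
    return unnormalized / np.trace(unnormalized).real

# Illustrative ensemble (assumption): |0> with probability 1/4, |+> with 3/4.
states = [np.array([1, 0], dtype=complex),
          np.array([1, 1], dtype=complex) / np.sqrt(2)]
probs = [1 / 4, 3 / 4]
rho = sum(p * np.outer(s, s.conj()) for s, p in zip(states, probs))

# Illustrative non-projective measurement (assumption): M_a^† M_a + M_b^† M_b = I.
M_a = np.diag([1, np.sqrt(1 / 2)]).astype(complex)
M_b = np.diag([0, np.sqrt(1 / 2)]).astype(complex)

agree = all(np.allclose(collapse_via_ensemble(states, probs, M),
                        collapse_via_density(rho, M)) for M in (M_a, M_b))
```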
Equivalence of Axiom 4 follows from the definition of a tensor product Hilbert space.