%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%% A Proof of Everett's Correlation Conjecture. %%% %%% August 2010 %%% %%% %%% Plain TeX, 9 pages %%% %%% Matthew J. Donald %%% %%% web site: http://people.bss.phy.cam.ac.uk/~mjd1014 %%% %%% e-mail : mjd1014@cam.ac.uk %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %\count17=0 %%to use pdfTeX comment out this line and uncomment the next \count17=1 \pdfoutput=\count17 %%to use plain TeX comment out this line %%%%%%%%%%%%%%%%%%%%%%%%%%%%%% and uncomment the previous %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \ifnum\count17=1 \def\cmykBlue{1 1 0 0} \def\cmykBlack{0 0 0 1} \def\Blue{\pdfsetcolor{\cmykBlue}} \def\Black{\pdfsetcolor{\cmykBlack}} \def\pdfsetcolor#1{\pdfliteral{#1 k}} \def\setcolor#1{\mark{#1}\pdfsetcolor{#1}} \def\maincolor{\cmykBlack} \pdfsetcolor{\maincolor} \pdfinfo { /Title (A Proof of Everett's Correlation Conjecture.) /Author (Matthew J. Donald) /CreationDate (August 2010) /ModDate (\number\year/\number\month/\number\day) /Subject (Quantum Information Theory.) /Keywords (quantum theory, correlation)} \fi \magnification=1200 \ifnum\count17=0 \fi \hsize=13cm \def\newline{\hfil\break} \def\newpage{\vfil\eject} \def\proclaim#1#2{\medskip\noindent{\bf #1}\quad \begingroup #2} \def\endproclaim{\endgroup\medskip} \def\proof{\noindent{\sl proof}\quad} \def\<{{<}} \def\>{{>}} \font\Bbb =msbm10 \def\Real{{\hbox{\Bbb R}}} \def\Complex{{\hbox {\Bbb C}}} \def\calH{{\cal H}} \def\calB{{\cal B}} \def\calZ{{\cal Z}} \def\calE{{\cal E}} \def\tr{{\rm tr}} \def\Prob{{\rm P}} \def\implies{\Rightarrow} \def\parasign{\S} \def\blacksquare{\vrule height 4pt width 3pt depth2pt} \def\dsize{\displaystyle} \abovedisplayskip=3pt plus 1pt minus 1pt \belowdisplayskip=3pt plus 1pt minus 1pt \def\hcrh{\hfill \cr \hfill} \def\crh{\cr \hfill} \def\hcr{\hfill \cr} \def\ent#1#2#3{\hbox{ent}_{#1}(#2\,|\,#3)} \font\brm=cmbx12 \def\tilbf{\lower 1.1 ex\hbox{\brm \char'176}} \font\trm=cmr12 \def\tilrm{\lower 1.1 ex\hbox{\trm \char'176}} \ifnum\count17=1 \def\link#1#2{\leavevmode\pdfstartlink attr{/Border [0 0 0]} goto name{#1}\setcolor\cmykBlue #2\pdfendlink\setcolor\cmykBlack} \def\name#1{\pdfdest name{#1} xyz} \def\outlink#1#2{\leavevmode \pdfstartlink attr{/Border [0 0 0]} user{/Subtype /Link /A << /S /URI /URI (#1) >>} \setcolor\cmykBlue #2\pdfendlink\setcolor\cmykBlack} \def\pdfeject{\eject} \else \def\link#1#2{{#2}} \def\outlink#1#2{{#2}} \def\name#1{} \def\pdfeject{} \fi \headline={\hfil} \footline={\hss\tenrm\folio\hss} {\bf \centerline{A Proof of Everett's Correlation Conjecture. %\ %\link{*}{*}\footnote{}{\tenrm * %August 2010.} } \medskip \centerline{Matthew J. Donald} \medskip \centerline{The Cavendish Laboratory, JJ Thomson Avenue,} \centerline{Cambridge CB3 0HE, Great Britain.} \smallskip \centerline{ e-mail: \quad mjd1014@cam.ac.uk} \smallskip {\bf \hfill web site:\quad \catcode`\~=12 \outlink{http://people.bss.phy.cam.ac.uk/~mjd1014} {http://people.bss.phy.cam.ac.uk/\tilbf mjd1014} \hfill }} \bigskip \noindent{\bf Abstract} \quad In his long \link{ref}{1957 paper}, ``The Theory of the Universal Wave Function'', Hugh Everett III made some significant preliminary steps towards the application and generalization of Shannon's information theory to quantum mechanics. 
In the course of doing so, he conjectured that, for a given wavefunction on a compound space, the Schmidt decomposition maximises the correlation between subsystem bases. This is proved here.
\bigskip
Let $\calH_1$ and $\calH_2$ be separable Hilbert spaces and $\calH = \calH_1 \otimes \calH_2$ be their tensor product. Let $\Psi \in \calH$ be a wavefunction -- by which I mean simply that $||\Psi|| = 1$. Suppose that $\calH_1$ has dimension $D_1 \le \infty$ and $\calH_2$ has dimension $D_2 \le \infty$. Without loss of generality, suppose that $D_1 \le D_2$. A Schmidt decomposition (\link{ref}{von Neumann 1932}, \link{ref}{Everett 1957}, and many modern textbooks) of $\Psi$ is an expansion of the form $\Psi = \sum_{n = 1}^{D_1} \sqrt{p_n} \varphi_n \psi_n$ where $(\varphi_n)_{n=1}^{D_1}$ is an orthonormal basis of $\calH_1$, $(\psi_n)_{n=1}^{D_2}$ is an orthonormal basis of $\calH_2$, $0 \le p_n \le 1$ and $\sum_{n=1}^{D_1} p_n = 1$. Schmidt decompositions always exist. They are unique, up to phase factors, as long as $D_1 = D_2$ and the $p_n$ are all distinct. Mixtures of notations from both mathematics and physics will be used and abused throughout this note. For example, $\varphi_n \psi_n$ is written here for $\varphi_n \otimes \psi_n$. In notation used later, the same wavefunction would appear as $|\varphi_n, \psi_n\>$ or simply as $|n, n\>$. Set $\sigma = |\Psi\>\<\Psi|$. $\sigma$ is a pure state on the algebra $\calB(\calH)$ of bounded operators on $\calH$ and for $B \in \calB(\calH)$ we shall write $\sigma(B) = \<\Psi|B|\Psi\> = \tr(|\Psi\>\<\Psi| B)$. Define $\sigma_1$ to be the reduced density matrix of $\sigma$ on $\calH_1$. In other words, $\sigma_1 = \tr_{\calH_2}(\sigma)$ is the partial trace of $\sigma$ over $\calH_2$, and, for all $B_1 \in \calB(\calH_1)$, $\sigma_1(B_1) = \sigma(B_1 \otimes 1_2)$ where $1_2$ is the identity operator on $\calH_2$. $\sigma_2 = \tr_{\calH_1}(\sigma)$ and $1_1$ are defined similarly. The Schmidt decomposition gives $$\sigma_1 = \sum_{n=1}^{D_1} p_n |\varphi_n\>\<\varphi_n| \quad \hbox{ and } \quad \sigma_2 = \sum_{n=1}^{D_1} p_n |\psi_n\>\<\psi_n|.$$ Suppose that $\calZ_1 = (P_i)_{i = 1}^I$ (respectively $\calZ_2 = (Q_j)_{j = 1}^J$) is a sequence \name{*}of orthogonal projections in $\calB(\calH_1)$ (resp.~$\calB(\calH_2)$) such that $\sum_{i = 1}^I P_i = 1_1$ (resp.~$\sum_{j = 1}^J Q_j = 1_2$). We shall write $P_i Q_j$ for $P_i \otimes Q_j \in \calB(\calH_1 \otimes \calH_2)$. In particular, write $\widetilde P_n = |\varphi_n\>\<\varphi_n|$ for $n = 1, \dots, D_1$, $\widetilde Q_n = |\psi_n\>\<\psi_n|$ for $n = 1, \dots, D_2$, $\widetilde\calZ_1 = (\widetilde P_n)_{n=1}^{D_1}$, and $\widetilde\calZ_2 = (\widetilde Q_n)_{n=1}^{D_2}$. In all \name{1}cases, $$\sum_{j =1}^J \sigma(P_i Q_j) = \sigma(P_i \otimes 1_2) = \sigma_1(P_i) \quad \hbox{ and } \quad \sum_{i=1}^I \sigma(P_i Q_j) = \sigma(1_1 \otimes Q_j) = \sigma_2(Q_j). \eqno{(1)}$$ When discussing relative entropies, we shall also use $\calZ_1$ and $\calZ_2$ to denote the abelian von Neumann algebras generated by $(P_i)_{i = 1}^I$ and $(Q_j)_{j = 1}^J$, while $\calZ$ will denote the abelian von Neumann algebra $\calZ_1 \otimes \calZ_2 \subset \calB(\calH)$ generated \name{2}by $(P_i Q_j)_{i = 1}^I{}_{j = 1}^J$. Define $$\displaylines{ \{\calZ_1, \calZ_2\}_{\Psi} = \sum_{i, j} \big[\sigma(P_i Q_j) \log \sigma(P_i Q_j) - \sigma(P_i Q_j) \log (\sigma_1(P_i)\sigma_2(Q_j)) \hcrh - \sigma(P_i Q_j) + \sigma_1(P_i)\sigma_2(Q_j)\big].
\qquad \llap(2) }$$
Of course $\sum_{i, j} \sigma(P_i Q_j) = \sum_{i, j} \sigma_1(P_i)\sigma_2(Q_j) = 1$ so that the final terms in definition \link{2}{(2)} are redundant in the finite-dimensional case. They are added for the general case, however, because then, by the standard inequality $$s \geq 0, r \geq 0 \implies s \log s - s \log r - s + r \geq 0,$$ each term in the sum is non-negative, and the sum is well-defined even if it is infinite. Note that with the convention that $0 \log 0 = 0$, each term in the sum is finite because $$\sigma_1(P_i)\sigma_2(Q_j) = 0 \implies \sigma(P_i Q_j) = 0.$$ This holds by the Cauchy-Schwarz inequality, or alternatively because $\sigma(P_i Q_j) \ge 0$ for each $j$ and so, by \link{1}{(1)}, $$\sigma_1(P_i) = 0 \implies \sigma(P_i Q_j) = 0 \hbox{ for all } j.$$ \link{2}{(2)} is the mutual information of random variables $X$ on $\{i = 1, \dots, I\}$ and $Y$ on $\{j = 1, \dots, J\}$ with joint distribution $\Prob_{joint}(i, j) = \sigma(P_i Q_j)$ (\link{ref}{Cover and Thomas 1991}). In other words, with the convention in which relative entropy is negative, it is the absolute value of the relative entropy of the joint distribution $\Prob_{joint}(i, j)$ with respect to the product distribution of the marginals $\Prob_{marg}(i, j) = \sigma_1(P_i)\sigma_2(Q_j)$. This means that the greater the mutual information, the more different the joint distribution is from the product distribution, and thus the more that $X$ and $Y$ are correlated. Generalizing to the quantum case, Everett considers operators $A = \sum_i a_i P_i$ on $\calH_1$ and $B = \sum_j b_j Q_j$ on $\calH_2$ and calls $\{\calZ_1, \calZ_2\}_{\Psi}$ the correlation between $A$ and $B$ on $\Psi$. He then conjectures \name{3}theorem 3.
\proclaim{Theorem 3} $$0 \le \{\calZ_1, \calZ_2\}_{\Psi} \le \{\widetilde \calZ_1, \widetilde \calZ_2\}_{\Psi} = -\sum_n p_n \log p_n.$$
\endproclaim
\proof Begin by assuming that $\dim \calH_1 = \dim \calH_2 = D < \infty$. Using \link{1}{(1)}, we can \name{4}write the result in the form $$\displaylines{ - \sum_j \sigma_2(Q_j) \log \sigma_2(Q_j) - \sum_i \sigma_1(P_i) \log \sigma_1(P_i) \hcrh \le - \sum_n p_n \log p_n - \sum_{i, j} \sigma(P_i Q_j) \log \sigma(P_i Q_j). \qquad \llap(4) }$$ Inequalities of this sort between quantum entropies involving sequences of operators which need not commute, such as the $(P_i)$ and $(\widetilde P_n)$ and the $(Q_j)$ and $(\widetilde Q_n)$, can be difficult to prove (even when they are true). The fundamental result in this area is strong subadditivity, \name{5}conjectured, in a statistical mechanical context, by \link{ref}{Lanford and Robinson (1968)} and proved by \link{ref}{Lieb and Ruskai (1973)}. This says that if we have a state $\rho$ on a Hilbert space $\calH = \calH_a \otimes \calH_b \otimes \calH_c$ and we define $\rho_b = \tr_{\calH_a \otimes \calH_c}(\rho)$, $\rho_{ab} = \tr_{\calH_c}(\rho)$, and $\rho_{bc} = \tr_{\calH_a}(\rho)$, then $$S(\rho) + S(\rho_{b}) \le S(\rho_{ab}) + S(\rho_{bc}) \eqno{(5)}$$ where, for any state $\omega$, $S(\omega) = - \tr(\omega \log \omega)$. Suppose that $\dim \calH_a = D < \infty$, and let $\tau_a$ be the totally mixed state on $\calH_a$. For $\sigma$ and $\omega$ any states on a Hilbert space $\calH$, the relative entropy of $\sigma$ with respect to $\omega$ is defined as $$\ent{\calB(\calH)}{\sigma}{\omega} = \tr(-\sigma \log \sigma + \sigma \log \omega).$$ This definition uses the convention that relative entropy is negative.
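\medskip
As an aside, which plays no part in the argument, both the entropy $S$ and the relative entropy just defined are easy to experiment with numerically, and readers may find it reassuring to see \link{5}{(5)} confirmed on random examples. The following sketch (in Python with numpy; the function names are merely illustrative) generates a random state on $\calH_a \otimes \calH_b \otimes \calH_c$ with each factor of dimension 2 and checks \link{5}{(5)} together with the non-positivity of the relative entropy.

import numpy as np

def random_density_matrix(dim, rank):
    # rho = A A* / tr(A A*) for a random complex dim x rank matrix A.
    a = np.random.randn(dim, rank) + 1j * np.random.randn(dim, rank)
    rho = a @ a.conj().T
    return rho / np.trace(rho).real

def von_neumann_entropy(rho):
    # S(rho) = -tr(rho log rho), computed from the eigenvalues of rho.
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]
    return float(-np.sum(w * np.log(w)))

def ent(sigma, omega):
    # Relative entropy in the (negative) convention of the text:
    # ent(sigma | omega) = tr(-sigma log sigma + sigma log omega) <= 0.
    w, v = np.linalg.eigh(omega)
    log_omega = (v * np.log(np.maximum(w, 1e-300))) @ v.conj().T
    return von_neumann_entropy(sigma) + float(np.trace(sigma @ log_omega).real)

def partial_trace(rho, dims, keep):
    # Reduced density matrix on the tensor factors listed in `keep`
    # (0-indexed), tracing out all the others.
    rho = rho.reshape(dims + dims)
    m = len(dims)
    for k in sorted((k for k in range(len(dims)) if k not in keep), reverse=True):
        rho = np.trace(rho, axis1=k, axis2=k + m)
        m -= 1
    d = int(np.prod([dims[k] for k in keep]))
    return rho.reshape(d, d)

dims = [2, 2, 2]                     # dim H_a = dim H_b = dim H_c = 2
rho = random_density_matrix(8, 3)    # a random mixed state on H_a x H_b x H_c
S = von_neumann_entropy
rho_b  = partial_trace(rho, dims, [1])
rho_ab = partial_trace(rho, dims, [0, 1])
rho_bc = partial_trace(rho, dims, [1, 2])
assert S(rho) + S(rho_b) <= S(rho_ab) + S(rho_bc) + 1e-9      # inequality (5)
assert ent(rho, np.kron(np.eye(2) / 2, rho_bc)) <= 1e-9       # relative entropy <= 0

The entropies are computed from eigenvalues, and the relative entropy from the spectral decomposition of its second argument, following the definitions above directly.
\medskip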
In \link{ref}{Donald (1986)} and \link{ref}{Donald (1992)}, I explain that $\exp\{\ent{\calB(\calH)}{\sigma}{\omega}\}$ can be interpreted as the probability per trial, given the information provided by operators in $\calB(\calH)$, of mistaking the state $\omega$ for the state $\sigma$. To apply this to understand why equation \link{5}{(5)} should be true, note that, using the \name{6}definition of the partial trace, $$\ent{\calB(\calH)}{\rho}{\tau_a \otimes \rho_{bc}} = S(\rho) - S(\rho_{bc}) - \log D$$ and $$\ent{\calB(\calH_{ab})}{\rho_{ab}}{\tau_a \otimes \rho_{b}} = S(\rho_{ab}) - S(\rho_{b}) - \log D.$$ This means that \link{5}{(5)} is equivalent to $$\ent{\calB(\calH)}{\rho}{\tau_a \otimes \rho_{bc}} \le \ent{\calB(\calH_{ab})}{\rho_{ab}}{\tau_a \otimes \rho_{b}}. \eqno{(6)}$$ \link{6}{(6)} can be interpreted as saying that we are less likely to mistake the state $\tau_a \otimes \rho_{bc}$ for the state $\rho$ if we can look at all the operators in $\calB(\calH)$ than if we just get to look at the operators in $\calB(\calH_{ab})$. In other words, \link{6}{(6)} is an example of the monotonicity of the relative entropy, a fundamental result with a wide variety of proofs and extensions (\link{ref}{Ohya and Petz 1993}). The difficulty now is to equate each of the terms in \link{4}{(4)} with the entropy of a state restriction as in \link{5}{(5)}. But first some preliminaries:
\medskip
$\sigma(\widetilde P_n \widetilde Q_m) = p_n \delta_{n, m}$ and $\sigma_1(\widetilde P_n) = \sigma_2(\widetilde Q_n) = p_n$ so that $$\{\widetilde \calZ_1, \widetilde \calZ_2\}_{\Psi} = \sum_n p_n \log p_n/(p_n)^2 = -\sum_n p_n \log p_n.$$ This determines the right hand side of theorem \link{3}{3}, and shows that, like the sequence $(p_n)$, it is independent of the choice of Schmidt decomposition of $\Psi$.
\proclaim{remark} It is tempting to try to prove Everett's conjecture using the monotonicity of the relative entropy under quantum operations (\link{ref}{Lindblad 1974}, \link{ref}{Uhlmann 1977}), with the quantum operation $\calE: \calB(\calH) \to \calB(\calH)$ defined by $\calE(B) = \sum_{ij} P_i Q_j B P_i Q_j$, or alternatively, by using monotonicity under restriction to the subalgebra $\calZ$. Everett's conjecture, however, seems remarkably strong in the sense that a direct use of these theorems just gives $$\ent{\calB(\calH)}{\sigma}{\sigma_1 \otimes \sigma_2} \le \ent{\calB(\calH)}{\sigma \circ \calE}{(\sigma_1 \otimes \sigma_2)\circ \calE} \le \ent{\calZ}{\sigma}{\sigma_1 \otimes \sigma_2} \le 0.$$ The useful terms here work out as $$ 2\sum_n p_n \log p_n = \ent{\calB(\calH)}{\sigma}{\sigma_1 \otimes \sigma_2} \le \ent{\calZ}{\sigma}{\sigma_1 \otimes \sigma_2} = - \{\calZ_1, \calZ_2\}_{\Psi} \le 0.$$ The problematic factor of 2 arises because $\sigma$ is a pure state on the algebra $\calB(\calH)$. $$-\ent{\calB(\calH)}{\sigma}{\sigma_1 \otimes \sigma_2} = S(\sigma_1) + S(\sigma_2) - S(\sigma)$$ is also the quantum mutual information in this situation, so various inequalities involving that quantity are also not quite as strong as Everett's conjecture. $\{\widetilde \calZ_1, \widetilde \calZ_2\}_{\Psi}$ is in fact the negative of the relative entropy $\ent{\widetilde \calZ}{\sigma}{\sigma_1 \otimes \sigma_2}$ (or in strict notation $\ent{\widetilde \calZ}{\sigma|_{\widetilde \calZ}}{\sigma_1|_{\widetilde \calZ_1} \otimes \sigma_2|_{\widetilde \calZ_2}}$) where, as von Neumann algebras, $\widetilde \calZ = \widetilde\calZ_1 \otimes \widetilde\calZ_2$.
Despite this failure, the quantum operation $\calE$ and its components $\calE^1$ and $\calE^2$, defined by $\calE^1(B_1) = \sum_i P_i B_1 P_i $ and $\calE^2(B_2) = \sum_j Q_j B_2 Q_j$, are central to the proof of the full result, in which we will use standard techniques to represent $\calE^1$ and $\calE^2$ as compositions of unitary maps and partial traces. There are strong similarities between the proof I shall give here and \link{ref}{Schumacher and Nielsen's (1996)} proof of the quantum data processing inequality. Nevertheless, it is not clear to me that Everett's inequality can be interpreted as an application of Schumacher and Nielsen's result. On the other hand, it may well come within the scope of exercise 12.15 of \link{ref}{Nielsen and Chuang (2000)} which begins: ``Apply all possible combinations of the subadditivity and strong subadditivity inequalities to deduce other inequalities for [a] two stage quantum process''! Over the last twenty years, the literature on quantum information theory has explored a vast variety of such combinations, to the point where I would be quite surprised if Everett's inequality were not out there somewhere. As yet, however, I have failed to find it. Anyway, it seems worth having a leisurely exposition of a direct and fairly simple proof linked to the {\catcode`\~=12 \catcode`\#=12 \outlink{http://people.bss.phy.cam.ac.uk/~mjd1014/books.html#42}{historical context.}}
\endproclaim
\proclaim{lemma} Suppose that $P_i = \sum_k P_{i, k}$. Then $$\displaylines{ \sum_{i, j} \sigma(P_i Q_j) \log \sigma(P_i Q_j)/(\sigma_1(P_i)\sigma_2(Q_j)) \hcrh \le \sum_{i, j, k} \sigma(P_{i, k} Q_j) \log \sigma(P_{i, k} Q_j) /(\sigma_1(P_{i, k}) \sigma_2(Q_j)). }$$
\endproclaim
\proof By the log sum inequality (\link{ref}{Everett 1957 Appendix I.2 lemma 2}, \link{ref}{Cover and Thomas 1991 theorem 2.7.1}), which is a consequence of the convexity of $x \log x$, $$\displaylines{ \sigma(P_i Q_j) \log \sigma(P_i Q_j)/(\sigma_1(P_i)\sigma_2(Q_j)) \hcr = \sum_k \sigma(P_{i, k} Q_j) \log \sum_{k'} \sigma(P_{i, k'} Q_j) / \sum_{k''}(\sigma_1(P_{i, k''})\sigma_2(Q_j)) \crh \le \sum_k \sigma(P_{i, k} Q_j) \log \sigma(P_{i, k} Q_j) / ( \sigma_1(P_{i, k})\sigma_2(Q_j)). \hfill \blacksquare }$$ It follows that it is sufficient to prove the theorem for $P_i$ and $Q_j$ one-dimensional projections. Assume therefore that $P_i = |\alpha_i\>\<\alpha_i|$ and $Q_j = |\beta_j\>\<\beta_j|$ where $\{ \alpha_i: i = 1, \dots, D\}$ (resp.~$\{\beta_j: j = 1, \dots, D\}$) is an orthonormal basis of $\calH_1$ (resp.~$\calH_2$).
\medskip
Now return to the problem of finding a way to express each term in \link{4}{(4)} as the entropy of a state restriction. Note first that there is a similarity between the two terms on the left of \link{4}{(4)} which does not seem to be matched by the form of the spaces used in \link{5}{(5)}. This issue can be dealt with by introducing a fourth space $\calH_d$ and a pure state on the total space $\calH_a \otimes \calH_b \otimes \calH_c \otimes \calH_d$. Using a Schmidt decomposition shows that the entropy of the partial trace of that pure state over $\calH_d$ will equal the entropy of its partial trace over $\calH_a \otimes \calH_b \otimes \calH_c$.
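\medskip
Before setting up that construction, here is a small numerical illustration of what is being proved (in Python with numpy; it plays no part in the proof, and the function names are merely illustrative). With $P_i$ and $Q_j$ one-dimensional, as has just been justified, the Schmidt weights $p_n$ are the squared singular values of the coefficient matrix of $\Psi$, and, for randomly chosen orthonormal bases, the correlation \link{2}{(2)} never exceeds $-\sum_n p_n \log p_n$:

import numpy as np

def random_state(d1, d2):
    # A random wavefunction Psi, stored as its d1 x d2 matrix of coefficients
    # in fixed orthonormal bases of H_1 and H_2.
    c = np.random.randn(d1, d2) + 1j * np.random.randn(d1, d2)
    return c / np.linalg.norm(c)

def random_basis(d):
    # A random orthonormal basis (the columns of a unitary), via QR.
    q, _ = np.linalg.qr(np.random.randn(d, d) + 1j * np.random.randn(d, d))
    return q

def shannon(p):
    # -sum p log p, with the convention 0 log 0 = 0.
    p = p[p > 1e-15]
    return float(-np.sum(p * np.log(p)))

def correlation(coeffs, a, b):
    # Definition (2) for the one-dimensional projections P_i = |a_i><a_i| and
    # Q_j = |b_j><b_j|: the mutual information of the joint distribution
    # sigma(P_i Q_j) = |<a_i, b_j|Psi>|^2.
    joint = np.abs(a.conj().T @ coeffs @ b.conj())**2
    return (shannon(joint.sum(axis=1)) + shannon(joint.sum(axis=0))
            - shannon(joint.ravel()))

d1, d2 = 3, 4
coeffs = random_state(d1, d2)
p = np.linalg.svd(coeffs, compute_uv=False)**2    # Schmidt weights p_n
bound = shannon(p)                                # -sum_n p_n log p_n
for _ in range(100):
    assert correlation(coeffs, random_basis(d1), random_basis(d2)) <= bound + 1e-9

Choosing the bases of singular vectors (that is, Schmidt bases for $\Psi$) gives equality, in accordance with the right hand side of theorem \link{3}{3}.
\medskip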
Introduce auxiliary spaces $\calH'_1$ and $\calH'_2$ with $\dim \calH'_1 = \dim \calH'_2 = D$ and orthonormal bases $\{|i\>: i = 0, \dots, D - 1\}$ and set $$|\Psi_0\> = |0\> \otimes |\Psi\> \otimes |0\> \in \calH'_1 \otimes \calH_1 \otimes \calH_2 \otimes \calH'_2.$$ Define a linear map $U$ on $\calH'_1 \otimes \calH_1$ by linear extension from $$U(|n\> \otimes |\alpha_i\>) = |n \oplus i\> \otimes |\alpha_i\>$$ where $\oplus$ is addition modulo $D$. $U$ is defined on an orthonormal basis, which it maps to another orthonormal basis: $$\displaylines{ \<m \oplus i', \alpha_{i'}\,|\,n \oplus i, \alpha_i\> = \<m \oplus i'|n \oplus i\>\<\alpha_{i'} |\alpha_i\> = \<m \oplus i|n \oplus i\>\, \delta_{i,i'} = \delta_{m,n} \delta_{i,i'} }$$ and so $U$ is unitary. Note that, for $\varphi \in \calH_1$, $$U(|0\> \otimes |\varphi\>) = U(\sum_i |0\> \otimes P_i |\varphi\>) = \sum_i | i\> \otimes P_i |\varphi\> = \sum_i \<\alpha_i| \varphi\> |i, \alpha_i \>.$$ Similarly, define a unitary map $V$ on $\calH_2 \otimes \calH'_2$ by $$V(|\beta_j\> \otimes |n\>) = |\beta_j\> \otimes |n \oplus j\>$$ and note \name{7}that $$V(|\psi\> \otimes |0\>) = \sum_j Q_j |\psi\> \otimes |j\> = \sum_j \<\beta_j| \psi\> |\beta_j, j \>.$$ Set $|\Psi'\> = (U \otimes V)|\Psi_0\>$. Then $$\displaylines{ |\Psi'\> = \sum_n \sqrt{p_n} U(|0\> \otimes |\varphi_n\>) \otimes V(|\psi_n\> \otimes |0\>) \hfill \llap(7) \crh = \sum_{i j n} \sqrt{p_n} |i\> \otimes P_i |\varphi_n\> \otimes Q_j|\psi_n\> \otimes |j\> = \sum_{i j} |i\> \otimes P_i Q_j|\Psi\> \otimes |j\>. }$$ Now set $\rho = |\Psi'\>\<\Psi'|$. It is straightforward to calculate all of the partial traces of this state, either directly, or by identifying Schmidt decompositions. Note that, as $\rho$ is pure, if $A \subset \{1', 1, 2, 2'\}$ and $A^c$ is its complement, a Schmidt decomposition of $|\Psi'\>$ will show that $S(\rho_{A}) = S(\rho_{A^c})$. The results give $$\eqalign{ S(\rho_{1}) &= S(\rho_{1'}) = S(\rho_{1'22'}) = S(\rho_{122'}) = - \sum_i \sigma_1(P_i) \log \sigma_1(P_i) \cr S(\rho_2) &= S(\rho_{2'}) = S(\rho_{1'12'}) = S(\rho_{1'12}) = - \sum_j \sigma_2(Q_j) \log \sigma_2(Q_j) \cr S(\rho_{1'2}) &= S(\rho_{12}) = S(\rho_{12'}) = S(\rho_{1'2'}) = - \sum_{i, j} \sigma(P_i Q_j) \log \sigma(P_i Q_j) \cr S(\rho_{1'1}) &= S(\rho_{22'}) = - \sum_n p_n \log p_n. \cr }$$ Several choices are now available to apply \link{5}{(5)}, including $a = 1'$, $b = 1$ and $c = 2$, and so the theorem is proved in the finite-dimensional case. To confirm that the entropies of the partial traces are as given, first note that, as $U$ and $V$ are unitary, \link{7}{(7)} provides the Schmidt decomposition of $|\Psi'\>$ with respect to the decomposition $$(\calH'_1 \otimes \calH_1 \otimes \calH_2 \otimes \calH'_2) = (\calH'_1 \otimes \calH_1) \otimes (\calH_2 \otimes \calH'_2).$$ This gives $S(\rho_{1'1}) = S(\rho_{22'}) = - \sum_n p_n \log p_n$. Next, write $$\displaylines{ |\Psi'\> = \sum_{i j} |i\> \otimes P_i Q_j|\Psi\> \otimes |j\> = \sum_i |i\> \otimes (\sum_j \<\alpha_i, \beta_j|\Psi\> |\alpha_i, \beta_j, j\>). }$$ For $i \ne i'$, the vectors $\sum_j \<\alpha_i, \beta_j|\Psi\> |\alpha_i, \beta_j, j\>$ and $\sum_j \<\alpha_{i'}, \beta_j|\Psi\> |\alpha_{i'}, \beta_j, j\>$ are orthogonal.
Moreover, $$|| \sum_j \<\alpha_i, \beta_j|\Psi\> |\alpha_i, \beta_j, j\> ||^2 = \sum_j |\<\alpha_i, \beta_j|\Psi\>|^2 = \sum_j \sigma(P_i Q_j) = \sigma_1(P_i)$$ so we have the Schmidt decomposition of $|\Psi'\>$ with respect to the decomposition $$(\calH'_1 \otimes \calH_1 \otimes \calH_2 \otimes \calH'_2) = (\calH'_1) \otimes ( \calH_1 \otimes \calH_2 \otimes \calH'_2)$$ giving $S(\rho_{1'}) = S(\rho_{122'}) = - \sum_i \sigma_1(P_i) \log \sigma_1(P_i)$. For the decomposition $$(\calH'_1 \otimes \calH_1 \otimes \calH_2 \otimes \calH'_2) = (\calH'_1 \otimes \calH'_2) \otimes ( \calH_1 \otimes \calH_2),$$ $$|\Psi'\> = \sum_{i j} |i\> \otimes P_i Q_j|\Psi\> \otimes |j\>$$ already presents a Schmidt decomposition, as for $(i, j) \ne (i', j')$ the vectors $P_i Q_j|\Psi\>$ and $P_{i'} Q_{j'}|\Psi\>$ are orthogonal. $||P_i Q_j|\Psi\>||^2 = \sigma(P_i Q_j)$ and so $$S(\rho_{12}) = S(\rho_{1'2'}) = - \sum_{i, j} \sigma(P_i Q_j) \log \sigma(P_i Q_j).$$ Given the underlying symmetry of the formalism under the exchange of $\calH_1$ and $(P_i)_i$ with $\calH_2$ and $(Q_j)_j$, it only remains to consider the decompositions $$(\calH'_1 \otimes \calH_1 \otimes \calH_2 \otimes \calH'_2) = (\calH_1) \otimes ( \calH'_1 \otimes \calH_2 \otimes \calH'_2)$$ and $$(\calH'_1 \otimes \calH_1 \otimes \calH_2 \otimes \calH'_2) = (\calH'_1 \otimes \calH_2) \otimes ( \calH_1 \otimes \calH'_2).$$ For these, write $$\displaylines{ |\Psi'\> = \sum_{i j} \<\alpha_i, \beta_j|\Psi\> |i, \alpha_i, \beta_j, j\>. }$$ For $i \ne i'$, the pairs $|\alpha_i\>$ and $|\alpha_{i'}\>$ and $\sum_j \<\alpha_i, \beta_j|\Psi\> |i, \beta_j, j\>$ and \newline $\sum_j \<\alpha_{i'}, \beta_j|\Psi\> |i', \beta_j, j\>$ are orthogonal, and $$||\sum_j \<\alpha_i, \beta_j|\Psi\> |i, \beta_j, j\>||^2 = \sum_j |\<\alpha_i, \beta_j|\Psi\>|^2 = \sum_j \sigma(P_i Q_j) = \sigma_1(P_i).$$ This yields $S(\rho_{1}) = S(\rho_{1'22'}) = - \sum_i \sigma_1(P_i) \log \sigma_1(P_i)$. For $(i, j) \ne (i', j')$, the pairs $|i, \beta_j\>$ and $|i', \beta_{j'}\>$ and $|\alpha_i, j\>$ and $|\alpha_{i'}, j'\>$ are orthogonal, and $$|\<\alpha_i, \beta_j|\Psi\>|^2 = \sigma(P_i Q_j)$$ so that $S(\rho_{1'2}) = S(\rho_{12'}) = - \sum_{i, j} \sigma(P_i Q_j) \log \sigma(P_i Q_j)$.
\medskip
Before dealing with the extension to infinite dimensions, note that the restriction $\dim \calH_1 = \dim \calH_2$ has only been used for notational convenience. Considering subspaces of such a situation is sufficient to yield the result for any pair of finite-dimensional spaces.
\proclaim{lemma} The theorem holds whenever the sequences $(P_i)_{i=1}^I$ and $(Q_j)_{j=1}^J$ are finite, and there are only finitely many $p_n > 0$.
\endproclaim
\proof Suppose $p_n > 0$ for $n = 1, \dots, N < \infty$. Let ${\cal K}_1$ be the Hilbert space spanned by the vectors $$\{P_i \varphi_n : i = 1, \dots, I; n = 1, \dots, N\}.$$ ${\cal K}_1$ is finite-dimensional. Let $P'_i$ be the restriction of $P_i$ to ${\cal K}_1$. Define ${\cal K}_2$ and $(Q'_j)$ similarly. Then $\Psi = \sum_n \sqrt{p_n} \varphi_n \otimes \psi_n = \sum_{i j n} \sqrt{p_n} P_i\varphi_n \otimes Q_j \psi_n \in {\cal K}_1 \otimes {\cal K}_2$ and the finite-dimensional result can be applied, giving $$\displaylines{ - \sum_n p_n \log p_n \ge \sum_{i, j} \sigma(P'_i Q'_j) \log \sigma(P'_i Q'_j)/(\sigma_1(P'_i)\sigma_2(Q'_j)) \hcrh = \sum_{i, j} \sigma(P_i Q_j) \log \sigma(P_i Q_j)/(\sigma_1(P_i)\sigma_2(Q_j)).
\hfill \blacksquare }$$
Now suppose that the sequences $(P_i)_{i=1}^I$ and $(Q_j)_{j=1}^J$ remain finite, but consider general $\Psi = \sum_n \sqrt{p_n} \varphi_n \psi_n$. Let $R^N$ be the projection on $\calH$ onto the finite-dimensional space spanned by $\{\varphi_m \psi_n: m = 1,\dots, N; n = 1, \dots, N\}$. Assume, without loss of generality, that $p_1 > 0$ and set $r^N_n = p_n /\sum_{n=1}^N p_n$ and $\sigma^N = |\Psi^N\>\<\Psi^N|$ where $$\Psi^N = \sum_{n=1}^N \sqrt{r^N_n} \varphi_n \psi_n.$$ Thus $\Psi^N$ is the normalization of $R^N \Psi$. The lemma shows that $$- \sum_{n=1}^N r^N_n \log r^N_n \ge \sum_{i, j} \sigma^N(P_i Q_j) \log \sigma^N(P_i Q_j)/(\sigma^N_1(P_i)\sigma^N_2(Q_j)).$$ This is equivalent to $$\sum_{n=1}^N r^N_n \log r^N_n \le \ent{\calZ}{\sigma^N|_{\calZ}}{\sigma^N_1|_{\calZ_1} \otimes \sigma^N_2|_{\calZ_2}}.$$ W*-upper semicontinuity of relative entropy (\link{ref}{Donald 1986}), or, in this abelian situation, Fatou's lemma, then implies that $$\displaylines{ \ent{\calZ}{\sigma|_{\calZ}}{\sigma_1|_{\calZ_1} \otimes \sigma_2|_{\calZ_2}} \ge \limsup_{N \to \infty} \ent{\calZ}{\sigma^N|_{\calZ}}{\sigma^N_1|_{\calZ_1} \otimes \sigma^N_2|_{\calZ_2}} \hcrh \ge \limsup_{N \to \infty} \sum_{n=1}^N r^N_n \log r^N_n = \limsup_{N \to \infty} ({\sum_{n=1}^N p_n \log p_n \over \sum_{n=1}^N p_n} - \log (\sum_{n=1}^N p_n)) = \sum_{n=1}^{\infty} p_n \log p_n. }$$
\medskip
Finally, for the case of infinite sequences $(P_i)_{i=1}^\infty$ and $(Q_j)_{j=1}^\infty$, write $P^I = \sum_{i = I+1}^\infty P_i$, $Q^J = \sum_{j = J+1}^\infty Q_j$. Then, applying what has just been proved to the finite sequences $(P_1, \dots, P_I, P^I)$ and $(Q_1, \dots, Q_J, Q^J)$ and discarding all but one of the non-negative terms which involve $P^I$ or $Q^J$, we have, for all finite $I$ and $J$, $$\displaylines{ - \sum_{n=1}^{\infty} p_n \log p_n \ge \sum_{i =1}^I \sum_{j=1}^J \big[\sigma(P_i Q_j) \log \sigma(P_i Q_j) - \sigma(P_i Q_j) \log (\sigma_1(P_i)\sigma_2(Q_j)) \hcrh - \sigma(P_i Q_j) + \sigma_1(P_i)\sigma_2(Q_j)\big] \cr + \big[\sigma(P^I Q^J) \log \sigma(P^I Q^J) - \sigma(P^I Q^J) \log (\sigma_1(P^I)\sigma_2(Q^J)) - \sigma(P^I Q^J) + \sigma_1(P^I)\sigma_2(Q^J)\big] \cr \ge \sum_{i =1}^I \sum_{j=1}^J \big[\sigma(P_i Q_j) \log \sigma(P_i Q_j) - \sigma(P_i Q_j) \log (\sigma_1(P_i)\sigma_2(Q_j)) \hcrh - \sigma(P_i Q_j) + \sigma_1(P_i)\sigma_2(Q_j)\big]. }$$ Taking the limits $I \to \infty$, $J \to \infty$ gives the required bound and completes the proof of the theorem. \hfill \blacksquare
\proclaim{References.}{}
\endproclaim
\frenchspacing \parindent=0pt
{\everypar={\hangindent=0.75cm \hangafter=1}
\name{ref} Cover, T.M. and Thomas, J.A. (1991) {\sl Elements of Information Theory.} (Wiley)

DeWitt, B.S. and Graham, N. (1973) {\sl The Many-Worlds Interpretation of Quantum Mechanics.} (Princeton)

Donald, M.J. (1986) ``On the relative entropy.'' {\sl Commun. Math. Phys.} {\bf 105}, 13--34.

Donald, M.J. (1992) ``A priori probability and localized observers.'' {\sl Foundations of Physics} {\bf 22}, 1111--1172.

Everett, H., III (1957) ``The theory of the universal wave function.'' Pages 1--140 of DeWitt and Graham (1973).

Lanford, O.E., III and Robinson, D.W. (1968) ``Mean entropy of states in quantum-statistical mechanics.'' {\sl J. Math. Phys.} {\bf 9}, 1120--1125.

Lieb, E.H. and Ruskai, M.B. (1973) ``Proof of the strong subadditivity of quantum-mechanical entropy.'' {\sl J. Math. Phys.} {\bf 14}, 1938--1941.

Lindblad, G. (1974) ``Expectations and entropy inequalities for finite quantum systems.'' {\sl Commun. Math. Phys.} {\bf 39}, 111--119.

Nielsen, M.A. and Chuang, I.L. (2000) {\sl Quantum Computation and Quantum Information.} (Cambridge)

Ohya, M. and Petz, D.
(1993) {\sl Quantum Entropy and Its Use.} (Springer-Verlag)

Schumacher, B. and Nielsen, M.A. (1996) ``Quantum data processing and error correction.'' {\sl Phys. Rev. A} {\bf 54}, 2629--2635. {\sl \outlink{http://arXiv.org/abs/quant-ph/9604022}{quant-ph/9604022}}

Uhlmann, A. (1977) ``Relative entropy and the Wigner-Yanase-Dyson-Lieb concavity in an interpolation theory.'' {\sl Commun. Math. Phys.} {\bf 54}, 21--32.

von Neumann, J. (1932) {\sl Mathematische Grundlagen der Quantenmechanik.} (Sprin\-ger). Everett cites the 1955 translation {\sl Mathematical Foundations of Quantum Mechanics.} (Princeton)
}
\vfill
\hfill August 2010
\end