compute hessian matrix of circuit
yuyuexi opened this issue · comments
When I use Zygote.hessian to compute the Hessian matrix of a circuit, i.e. Zygote.hessian(f, params), where params are the parameters of a variational circuit and f(params) is a real number, I get a MethodError whose message is below.
It seems that when ForwardDiff uses Dual numbers to compute the Jacobian matrix, the Dual type is not supported by the RotationGate in this circuit. So how can I compute the Hessian of my real-valued function f?
(For example: f(params) = operator_fidelity(target_unitary, dispatch!(circuit, params)).)
Thanks for the issue.
Yao is not compatible with Zygote. You may want to combine Yao's built-in AD engine and Dual numbers to obtain hessians.
julia> using Yao
julia> using ForwardDiff: Dual
julia> reg = ArrayReg(Complex.(Dual.(randn(128), zeros(128)), Dual.(randn(128), zeros(128))))
ArrayReg{1, Complex{Dual{Nothing,Float64,1}}, Array...}
active qubits: 7/7
julia> c = put(7, 2=>Rx(Dual(2.1, 1.0)))
nqubits: 7
put on (2)
└─ rot(X, Dual{Nothing}(2.1,1.0))
julia> expect'(put(7, 2=>Z), reg=>c)
ArrayReg{1, Complex{Dual{Nothing,Float64,1}}, Array...}
active qubits: 7/7 => Dual{Nothing,Float64,1}[Dual{Nothing}(38.65132803757106,-11.504413665590928)]
Ah, the operator fidelity does not work. But it will be fixed in this PR: QuantumBFS/YaoBlocks.jl#150
Please try
pkg> add YaoBlocks#master
Thanks for the timely reply. I tried your sample code and it runs well.
However, I ran into some problems that I still cannot figure out.
- Firstly, I ran the circuit with Dual parameters and with normal parameters respectively, and found that their outputs are not the same. More precisely, the expectation value of the normal circuit is exactly half that of the Dual circuit. I expected them to be identical apart from the partials field?
using Yao
using ForwardDiff: Dual
s = randn(4)
reg = ArrayReg(Complex.(Dual.(s, zeros(4)), Dual.(s, zeros(4))))
c = chain(2, put(2=>Rx(Dual(2.1, 1.0))))
println(expect(chain(2, put(2=>Z)), reg=>c))
println(expect'(chain(2, put(2=>Z)), reg=>c))
Dual{Nothing}(-2.9223602277359566,-4.996787532515955) + Dual{Nothing}(0.0,0.0)*im
ArrayReg{1, Complex{Dual{Nothing,Float64,1}}, Array...}
active qubits: 2/2 => Dual{Nothing,Float64,1}[Dual{Nothing}(-4.996787532515955,2.922360227735957)]
reg = ArrayReg(Complex.(s, zeros(4)))
c = chain(2, put(2=>Rx(2.1)))
println(expect(chain(2, put(2=>Z)), reg=>c))
println(expect'(chain(2, put(2=>Z)), reg=>c))
-1.4611801138679787 + 0.0im
ArrayReg{1, Complex{Float64}, Array...}
active qubits: 2/2 => [-2.4983937662579776]
- Secondly, when I replaced the Rx gate by an Ry gate, a StackOverflowError was raised. I expected this change not to affect anything important. Or is there something I did not interpret correctly?
using Yao
using ForwardDiff: Dual
s = randn(4)
reg = ArrayReg(Complex.(Dual.(s, zeros(4)), Dual.(s, zeros(4))))
c = chain(2, put(2=>Ry(Dual(2.1, 1.0))))
println(expect(chain(2, put(2=>Z)), reg=>c))
println(expect'(chain(2, put(1=>Z), put(2=>Z)), reg=>c))
Dual{Nothing}(-5.727229682144747,14.589119935053684) + Dual{Nothing}(0.0,0.0)*im
StackOverflowError:
Stacktrace:
[1] _cpow(::Complex{Dual{Nothing,Float64,1}}, ::Complex{Dual{Nothing,Float64,1}}) at ./complex.jl:780 (repeats 51934 times)
[2] ^(::Complex{Dual{Nothing,Float64,1}}, ::Complex{Dual{Nothing,Float64,1}}) at ./complex.jl:781
[3] ^(::Complex{Dual{Nothing,Float64,1}}, ::Complex{Int64}) at ./promotion.jl:343
[4] ^(::Complex{Dual{Nothing,Float64,1}}, ::Int64) at ./complex.jl:786
...
(message omitted)
- Thirdly, I have not dived very deep into the Yao package, so I naively expected this code to return the gradient with respect to each parameter when there is more than one parameter, which is not what I found. Is there anything I should be aware of?
using Yao
using ForwardDiff: Dual
s = randn(4)
reg = ArrayReg(Complex.(Dual.(s, zeros(4)), Dual.(s, zeros(4))))
c = chain(2, put(1=>Rx(Dual(2.1, 1.0))), put(2=>Rx(Dual(2.1, 1.0))))
println(expect(chain(2, put(1=>Z), put(2=>Z)), reg=>c))
Dual{Nothing}(-0.11228066743748855,2.4246744104155162) + Dual{Nothing}(0.0,0.0)*im
Again, thanks for your timely reply; I look forward to your further advice.
Ah, I should have mentioned the higher-level API of ForwardDiff. Here is an example of computing the Hessian:
using ForwardDiff: jacobian, Dual
using Yao
using LinearAlgebra: I
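# Workaround for the StackOverflowError in ^ on Complex{<:Dual}
# (see JuliaDiff/ForwardDiff.jl#486): integer power by repeated multiplication.
function Base.:(^)(x::Complex{<:Dual}, n::Int)
    y = one(x)
    for i = 1:n
        y *= x
    end
    y
end

# Gradient of the operator fidelity with respect to the circuit parameters.
function compute_gradient(params::AbstractVector{T}) where T
    target = matblock(Matrix{Complex{T}}(I, 1<<5, 1<<5))
    c = chain(5,
        put(5, 2=>Rx(params[1])),
        put(5, 1=>Ry(params[2])),
        put(5, 3=>Rz(params[3])),
        put(5, 2=>shift(params[4]))
    )
    operator_fidelity'(target, c)[2]
end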
x = rand(4)*2π
g = compute_gradient(x)
h = jacobian(compute_gradient, x)
Here jacobian is a function that computes the Jacobian matrix using ForwardDiff.
Functions with a prime (') compute parameter gradients. The parameters of a circuit can be obtained with parameters(c).
About your questions:
- expect returns the expectation value; expect' (with prime) returns a pair of gradients (d[expectation value]/d[register] => d[expectation value]/d[circuit parameters]). So they are very different.
- Nice catch. It appears to be a bug in ForwardDiff. In the above example, we overwrite the pow function in Base in order to make it work. I filed an issue here: JuliaDiff/ForwardDiff.jl#486
- This is in fact a question about Dual numbers: a Dual computes d[multiple outputs]/d[single input] rather than returning gradients. To obtain the Hessian, you need to enumerate over the inputs, or simply use the above jacobian function (recommended). FYI: check this arXiv paper: https://arxiv.org/abs/1607.07892
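To illustrate the last point, here is a minimal sketch of what seeding a dual number does. It uses a hand-written MiniDual type and a toy function f, both hypothetical stand-ins (not ForwardDiff's Dual and not a Yao circuit), so it only shows the principle: one forward pass gives the derivative along one seeded direction, and the full gradient needs one pass per input.

```julia
# Minimal forward-mode dual number (hypothetical helper, not ForwardDiff.Dual).
struct MiniDual
    val::Float64   # function value
    der::Float64   # derivative along the seeded direction
end

Base.:+(a::MiniDual, b::MiniDual) = MiniDual(a.val + b.val, a.der + b.der)
Base.:*(a::MiniDual, b::MiniDual) = MiniDual(a.val * b.val, a.der * b.val + a.val * b.der)
Base.sin(a::MiniDual) = MiniDual(sin(a.val), cos(a.val) * a.der)

# Toy scalar function of two parameters (stands in for a circuit loss).
f(x) = sin(x[1]) * x[2] + x[1] * x[1]

# One forward pass with only input i seeded yields the partial derivative df/dx_i.
function partial(f, x::Vector{Float64}, i::Int)
    seeded = [MiniDual(x[j], j == i ? 1.0 : 0.0) for j in eachindex(x)]
    return f(seeded).der
end

# Enumerating the seed over all inputs recovers the full gradient.
gradient(f, x) = [partial(f, x, i) for i in eachindex(x)]

x = [0.3, 1.7]
g = gradient(f, x)

# Seeding every input with 1.0 at once (as in the two-parameter snippet earlier
# in the thread) gives the sum of the partials, not the individual entries.
both_seeded = f([MiniDual(x[1], 1.0), MiniDual(x[2], 1.0)]).der
```

With this picture, jacobian(compute_gradient, x) can be read as doing the enumeration for you: each seeded direction produces one column of the derivative of the gradient, i.e. of the Hessian.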
Thanks for your careful explanation. I am afraid I have not figured everything out yet.
- Firstly, I understand that 1) a function with a prime means its derivative, and 2) a Dual number has two fields, i.e. value and partials (as described in the arXiv paper you linked). So for the expect and expect' (with prime) calls I mentioned before, my understanding is that the first field of the output of expect with Dual input is the expectation value of the circuit, which should be the same as the output of expect with normal complex input, and the second field is the derivative, which should be the same as the output of expect' (with prime) with normal complex input. But as shown in my last reply, the value field of the output of expect with Dual input is double the output of expect with normal complex input, and similarly for the partials field.
- Thanks for filing that issue; I will follow it.
- I ran your sample code, but it gives me an error message that looks like the one I got when using Zygote.hessian to compute the Hessian of a circuit.
using ForwardDiff: jacobian, Dual
using Yao
using LinearAlgebra: I
function Base.:(^)(x::Complex{<:Dual}, n::Int)
    y = one(x)
    for i = 1:n
        y *= x
    end
    y
end

function compute_gradient(params::AbstractVector{T}) where T
    target = matblock(Matrix{Complex{T}}(I, 1<<5, 1<<5))
    c = chain(5,
        put(5, 2=>Rx(params[1])),
        put(5, 1=>Ry(params[2])),
        put(5, 3=>Rz(params[3])),
        put(5, 2=>shift(params[4]))
    )
    operator_fidelity'(target, c)[2]
end
x = rand(4)*2π
g = compute_gradient(x)
h = jacobian(compute_gradient, x)
MethodError: no method matching Float64(::Dual{ForwardDiff.Tag{typeof(compute_gradient),Float64},Float64,4})
Closest candidates are:
Float64(::Real, !Matched::RoundingMode) where T<:AbstractFloat at rounding.jl:200
Float64(::T) where T<:Number at boot.jl:715
Float64(!Matched::Int8) at float.jl:60
...
Stacktrace:
[1] convert(::Type{Float64}, ::Dual{ForwardDiff.Tag{typeof(compute_gradient),Float64},Float64,4}) at ./number.jl:7
[2] Complex{Float64}(::Dual{ForwardDiff.Tag{typeof(compute_gradient),Float64},Float64,4}, ::Int64) at ./complex.jl:12
[3] Complex{Float64}(::Dual{ForwardDiff.Tag{typeof(compute_gradient),Float64},Float64,4}) at ./complex.jl:35
[4] convert(::Type{Complex{Float64}}, ::Dual{ForwardDiff.Tag{typeof(compute_gradient),Float64},Float64,4}) at ./number.jl:7
[5] setindex! at ./array.jl:828 [inlined]
[6] hvcat_fill at ./abstractarray.jl:1707 [inlined]
...
(message omitted)
thanks again for your kind reply!
- It is true that the gradients obtained in Yao and ForwardDiff differ by a factor of 2. This is because they follow different conventions for complex-valued gradients: Yao differentiates only either the ket or the bra. The overall factor is not important in gradient-based training.
- You need to show this part :D
(message omitted)
BTW: you need to use the master branch of YaoBlocks, otherwise you will see the above error.
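One more thing that may be worth checking in the two snippets compared earlier: the registers are not built from the same state. The Dual run uses Complex.(Dual.(s, ...), Dual.(s, ...)), whose real and imaginary parts are both s, while the Float64 run uses Complex.(s, zeros(4)), which is purely real. The sketch below (plain Julia, with a fixed vector standing in for randn(4)) shows that the squared norms of these two vectors differ by exactly 2, which by itself would double any unnormalized expectation value and its gradient.

```julia
using LinearAlgebra: norm

# Fixed stand-in for s = randn(4), so the check is reproducible.
s = [0.5, -1.0, 2.0, 0.25]

# State used in the Dual snippet: real and imaginary parts are both s.
psi_dual_run = Complex.(s, s)

# State used in the Float64 snippet: purely real.
psi_plain_run = Complex.(s, zeros(4))

# psi_dual_run == (1 + im) * psi_plain_run, so its squared norm is |1 + im|^2 = 2 times larger.
ratio = norm(psi_dual_run)^2 / norm(psi_plain_run)^2
```

This is consistent with the printed numbers: in the Dual run, the partials field of expect (-4.996787532515955) equals the value of the expect' result, and both the value and the gradient are twice those of the Float64 run.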
Ah, sorry, I forgot to update. Now the sample code works well for me.
Thanks again. I believe this actually solves my problem. Means a lot!
Sorry to bother you again. I can actually accomplish my project based on the discussion above. However, I found a subtle problem with your sample code, and I think it might be related to some feature of the Yao package. To understand all the details, I decided to ask again.
I noticed that in your sample code the circuit is defined inside the function compute_gradient, and it works well. For more general usage, I modified the code and defined the circuit outside of this function. Then I got a MethodError, just like when I used Zygote.hessian to compute the Hessian before. Is there anything I did not understand correctly?
using ForwardDiff: jacobian, Dual
using Yao
using LinearAlgebra: I
function Base.:(^)(x::Complex{<:Dual}, n::Int)
    y = one(x)
    for i = 1:n
        y *= x
    end
    y
end

function f1(params::AbstractVector{T}) where T
    target = matblock(Matrix{Complex{T}}(I, 1<<5, 1<<5))
    c = chain(5,
        control(5, 1, 2=>Rx(params[1])),
        put(5, 1=>Ry(params[2])),
        put(5, 3=>Rz(params[3])),
        put(5, 2=>shift(params[4]))
    )
    circ = dispatch!(c, params) # this is used for consistency
    -operator_fidelity'(target, circ)[2]
end

function f2(params::AbstractVector{T}) where T
    target = matblock(Matrix{Complex{T}}(I, 1<<5, 1<<5))
    circ = dispatch!(c1, params)
    -operator_fidelity'(target, circ)[2]
end

x = rand(4)*2π
c1 = chain(5,
    control(5, 1, 2=>Rx(x[1])),
    put(5, 1=>Ry(x[2])),
    put(5, 3=>Rz(x[3])),
    put(5, 2=>shift(x[4]))
)
println("diff of grad: ")
println(f1(x) - f2(x))
println("jacobian of f1: ")
println(jacobian(f1, x))
println("jacobian of f2: ")
println(jacobian(f2, x))
diff of grad:
[0.0, 0.0, 0.0, 0.0]
jacobian of f1:
[0.0010114056391029905 -0.008235282098312148 -0.0002049829250582547 -0.002800577794016422; -0.00823528209831215 0.002129952406713545 -0.015755781525783795 -0.21526325598068313; -0.0002049829250582547 -0.015755781525783802 0.002129952406713597 -0.0053580789755250675; -0.002800577794016422 -0.21526325598068316 -0.005358078975525069 0.002129952406713596]
jacobian of f2:
MethodError: no method matching Float64(::Dual{ForwardDiff.Tag{typeof(f2),Float64},Float64,4})
Closest candidates are:
Float64(::Real, !Matched::RoundingMode) where T<:AbstractFloat at rounding.jl:200
Float64(::T) where T<:Number at boot.jl:715
Float64(!Matched::Int8) at float.jl:60
...
Stacktrace:
[1] convert(::Type{Float64}, ::Dual{ForwardDiff.Tag{typeof(f2),Float64},Float64,4}) at ./number.jl:7
[2] setproperty!(::RotationGate{1,Float64,XGate}, ::Symbol, ::Dual{ForwardDiff.Tag{typeof(f2),Float64},Float64,4}) at ./Base.jl:34
...
(message omitted)
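ForwardDiff can only handle generic code, because it works by replacing numbers with Dual types to compute gradients. f2 is not generic: c1 is built outside the function with Float64 parameters, so dispatch! cannot store Dual values in it (this is the setproperty! on RotationGate{1,Float64,XGate} in the stack trace).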
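The difference between f1 and f2 can be reproduced without Yao or ForwardDiff. The following is a minimal sketch under stated assumptions: ToyDual and ToyGate are hypothetical stand-ins for ForwardDiff's Dual and Yao's RotationGate, chosen only to show why writing a dual number into storage that was typed as Float64 fails, while building the object from the input succeeds.

```julia
# Stand-in for a dual number type (hypothetical; not ForwardDiff.Dual).
struct ToyDual <: Real
    val::Float64
    der::Float64
end

# Stand-in for a rotation gate whose angle type is fixed at construction.
mutable struct ToyGate{T}
    theta::T
end

# Non-generic pattern (like f2): mutate a gate that already exists with Float64 storage.
outer_gate = ToyGate(2.1)                 # ToyGate{Float64}
set_angle!(g::ToyGate, theta) = (g.theta = theta; g)

# Generic pattern (like f1): build a fresh gate whose storage type follows the input.
build_gate(theta) = ToyGate(theta)

d = ToyDual(2.1, 1.0)
generic_gate = build_gate(d)              # ToyGate{ToyDual}, no conversion needed

# Assigning into the Float64 field requires convert(Float64, ::ToyDual),
# which has no method, so a MethodError is raised.
failed_with_method_error = try
    set_angle!(outer_gate, d)
    false
catch err
    err isa MethodError
end
```

In the thread's terms, f2 would likely need to construct (or re-construct) the circuit from params inside the function, as f1 does, so that the gate parameter type can follow the Dual element type.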