Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

estimate_cov for subset and subset infinity sometimes returns complex number #138

Closed
AnderGray opened this issue Oct 26, 2023 · 3 comments · Fixed by #142
Closed

estimate_cov for subset and subset infinity sometimes returns complex number #138

AnderGray opened this issue Oct 26, 2023 · 3 comments · Fixed by #142
Assignees
Labels
bug Something isn't working

Comments

@AnderGray
Copy link
Collaborator

AnderGray commented Oct 26, 2023

Performing convergence study for MC, subset, and subset infinity.

Sometimes (randomly) estimate_cov is trying to take the sqrt of a complex number. Script for the study

using UncertaintyQuantification, DelimitedFiles, Formatting
using Distributions, Plots

###
#   Define input random variables
###
X1 = RandomVariable(Normal(0, 1), :X1)
X2 = RandomVariable(Normal(0, 1), :X2)

inputs = [X1, X2]

###
#   Define function, and model
###
function y(df)
    return df.X1 .+ df.X2
end

m = Model(y, :y)

###
#   Define limitstate. Failure is limitstate <= 0
###
function limitstate(df)
    return 4 .- reduce(vcat, df.y)
end

###
#   True failure for this problem
###
True_failure_prob = 1 - cdf(Normal(), 4/sqrt(2))

N_samples = 10 .^range(2, 5, 50)

pf_1 = zeros(length(N_samples))
pf_2 = zeros(length(N_samples))
pf_3 = zeros(length(N_samples))

cov_1 = zeros(length(N_samples))
cov_2 = zeros(length(N_samples))
cov_3 = zeros(length(N_samples))


for (i, N) in enumerate(N_samples)
    println(i)
    ###
    #   Try different simulation methods
    ###
    simulation_method_1 = MonteCarlo(Integer(round(N * 3)))
    simulation_method_2 = SubSetSimulation(Integer(round(N)), 0.1, 10, Normal())
    simulation_method_3 = SubSetInfinity(Integer(round(N)), 0.1, 10, 0.5)

    ###
    #   Simulate
    ###
    pf_1[i], cov_1[i], _ = probability_of_failure(m, limitstate, inputs, simulation_method_1)
    pf_2[i], cov_2[i], _ = probability_of_failure(m, limitstate, inputs, simulation_method_2)
    pf_3[i], cov_3[i], _ = probability_of_failure(m, limitstate, inputs, simulation_method_3)

end



plot(N_samples, pf_1)
plot!(N_samples, pf_2)
plot!(N_samples, pf_3)
plot!(N_samples, ones(length(N_samples)) * True_failure_prob,  xaxis=:log10)


plot(N_samples, cov_1)
plot!(N_samples, cov_2)
plot!(N_samples, cov_3, xaxis=:log10)
ERROR: LoadError: DomainError with -0.0004979094436062081:
sqrt will only return a complex result if called with a complex argument. Try sqrt(Complex(x)).
Stacktrace:
 [1] throw_complex_domainerror(f::Symbol, x::Float64)
   @ Base.Math ./math.jl:33
 [2] sqrt
   @ ./math.jl:677 [inlined]
 [3] estimate_cov(Iᵢ::BitMatrix, pf::Float64, n::Int64)
   @ UncertaintyQuantification ~/Documents/julia/UQ_Fork/UncertaintyQuantification.jl/src/simulations/subset.jl:293
 [4] probability_of_failure(models::Model, performancefunction::typeof(limitstate), inputs::Vector{RandomVariable}, sim::SubSetInfinity)
   @ UncertaintyQuantification ~/Documents/julia/UQ_Fork/UncertaintyQuantification.jl/src/simulations/subset.jl:130
 [5] top-level scope
   @ ~/Documents/julia/UQ_Fork/UncertaintyQuantification.jl/demo/Damask/example_convergance/example_convergance.jl:58
 [6] include(fname::String)
   @ Base.MainInclude ./client.jl:478
 [7] top-level scope
   @ REPL[19]:1
in expression starting at
@AnderGray AnderGray added the bug Something isn't working label Oct 26, 2023
@FriesischScott FriesischScott self-assigned this Oct 26, 2023
@FriesischScott
Copy link
Owner

I'll investigate.

@FriesischScott
Copy link
Owner

Next time please just paste the code here instead of uploading a zip file.

@FriesischScott
Copy link
Owner

Turns out the issue here is using a sample size where N*target leads to less samples in the levels >=2. For example using 205 and 0.1 leads to 189 samples in the subsets because of the way the number of chains and samples per chain are calculated

number_of_seeds = Int64(max(1, ceil(sim.n * sim.target)))
samples_per_seed = Int64(floor(sim.n / number_of_seeds))

here 21 seeds and 9 chains = 189 != 205.

However the sample size passed to the cov estimation is always n.

This is easily fixed by computing n for the estimation from the available samples.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants