API

DegreesOfFreedom.anim_plot — Method

anim_plot(βs, βlasso)

Compare the lasso solution from glmnet and the iterative ridge via an animated plot.

DegreesOfFreedom.calc_df_mars — Method

calc_df_mars(;n = 100, p = 10, N = 100, nk = 5, d = 1, penalty = d+1, tol = 1e-6, seedx = rand(UInt), seedy = rand(UInt))

Calculate the degrees of freedom of MARS, and extract the nominal degrees of freedom used in earth::earth.

DegreesOfFreedom.cvlasso_vs_iter_ridge — Function

cvlasso_vs_iter_ridge(n, p)

Compare lasso with LOOCV and iterative ridge.

DegreesOfFreedom.demo_lasso — Function

demo_lasso(n, p, p1)

Demo for Lasso fitted by iterative ridge regressions.

DegreesOfFreedom.demo_lasso_df — Function

demo_lasso_df(n, p)

Demo for degrees of freedom of lasso via the iterative ridge regression.

DegreesOfFreedom.df_regtree — Method

df_regtree(; ps = [1, 5, 10], maxd = 4)

Experiment for degrees of freedom for regression trees with number of features ps and maximum depth maxd.

DegreesOfFreedom.df_splines — Method

df_splines(; Js = [5, 10, 15], λs = [0.001, 0.01, 0.1], n = 20, nrep = 10, nMC = 100)

Calculate the empirical degrees of freedom of four splines:

cubic splines with number of basis functions Js
smoothing splines with tuning parameter λs
n: sample size
nrep: number of repetitions
nMC: number of Monte Carlo samples
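
As a rough illustration of what a Monte Carlo estimate of degrees of freedom can look like, the sketch below uses the covariance definition df = (1/σ²) Σᵢ cov(ŷᵢ, yᵢ). It is not the package's implementation: `empirical_df` and `polyfit` are hypothetical names, and a polynomial least-squares fit stands in for a spline basis so that the true value (the number of basis functions) is known.

```julia
using LinearAlgebra, Random, Statistics

# Monte Carlo estimate of df = (1/σ²) Σᵢ cov(ŷᵢ, yᵢ).
# `fit(x, y)` is any procedure returning fitted values at the training points.
# (Hypothetical helper, not part of DegreesOfFreedom.)
function empirical_df(fit, x::AbstractVector, f::AbstractVector;
                      σ = 1.0, nMC = 100, rng = Random.default_rng())
    n = length(x)
    Y = zeros(n, nMC)
    Yhat = zeros(n, nMC)
    for b in 1:nMC
        y = f .+ σ .* randn(rng, n)   # fresh noise around the same mean
        Y[:, b] = y
        Yhat[:, b] = fit(x, y)
    end
    # sample covariance between fitted and observed values, per observation
    return sum(cov(Yhat[i, :], Y[i, :]) for i in 1:n) / σ^2
end

# Least squares on J polynomial basis functions (assumed stand-in for splines);
# a linear smoother like this has df equal to J.
polyfit(J) = (x, y) -> begin
    B = [xi^(j - 1) for xi in x, j in 1:J]
    B * (B \ y)
end

rng = MersenneTwister(1)
x = collect(range(0, 1; length = 20))
f = sin.(2π .* x)
df_hat = empirical_df(polyfit(4), x, f; nMC = 2000, rng = rng)
```

With enough Monte Carlo samples, `df_hat` should be close to 4 here, since the fit is linear in y with four basis functions.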

DegreesOfFreedom.gen_data — Function

gen_data(n, p, p1)

Generate simulation data:

X of size n × p
β of size p, where only the first p1 elements are signal
y of size n: y = Xβ + ε
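
A minimal sketch of such a generator is given below, only to make the shapes concrete. The Gaussian design, the unit signal values, and the noise level are assumptions, and `gen_data_sketch` is a hypothetical name; the package's `gen_data` may differ in all of these.

```julia
using Random

# Hypothetical stand-in for gen_data: Gaussian design, unit signals, Gaussian noise.
function gen_data_sketch(n, p, p1; σ = 1.0, rng = Random.default_rng())
    X = randn(rng, n, p)             # n × p design matrix
    β = zeros(p)
    β[1:p1] .= 1.0                   # only the first p1 coefficients are signal (assumed value 1)
    y = X * β + σ * randn(rng, n)    # response of length n
    return X, β, y
end

X, β, y = gen_data_sketch(50, 10, 3; rng = MersenneTwister(0))
size(X), length(β), length(y)
```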

DegreesOfFreedom.gen_data_mars — Function

gen_data_mars(N = 100, p = 2)

Generate N observations from the tensor-product example with p predictors in Section 9.4.2 of Hastie et al. (2009) (The ESL book).
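
For orientation, the tensor-product model in that section of ESL is Y = (X₁ − 1)₊ + (X₁ − 1)₊ · (X₂ − 0.8)₊ + 0.12 ε, with independent standard normal predictors and errors. The sketch below generates data from it; `gen_data_mars_sketch`, its return values, and the `p ≥ 2` check are assumptions and need not match the package's `gen_data_mars`.

```julia
using Random

# positive part (u)₊
pospart(u) = max(u, zero(u))

# Hypothetical stand-in for gen_data_mars, following the ESL tensor-product model.
function gen_data_mars_sketch(N = 100, p = 2; rng = Random.default_rng())
    p ≥ 2 || throw(ArgumentError("this sketch needs at least two predictors"))
    X = randn(rng, N, p)                       # only the first two columns carry signal
    h1 = pospart.(X[:, 1] .- 1)
    h2 = pospart.(X[:, 2] .- 0.8)
    y = h1 .+ h1 .* h2 .+ 0.12 .* randn(rng, N)
    return X, y
end

X, y = gen_data_mars_sketch(100, 5; rng = MersenneTwister(0))
```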

DegreesOfFreedom.iter_ridge — Method

iter_ridge(X::AbstractMatrix, y::AbstractVector, λ::Real)

Conduct iterative ridge regression for y on X with smoothness penalty parameter λ.
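
One common way to approximate a lasso fit by repeated ridge regressions is a local quadratic approximation of the ℓ₁ penalty, which turns each step into a ridge problem with coefficient-specific weights. The sketch below shows that scheme under the assumed objective ½‖y − Xβ‖² + λ‖β‖₁; it is not necessarily the algorithm or the scaling of λ used by `iter_ridge`, and `iter_ridge_sketch`, `maxit`, `tol`, and `ϵ` are hypothetical names.

```julia
using LinearAlgebra

# Iteratively reweighted ridge: β ← (X'X + λ diag(1/|βⱼ|))⁻¹ X'y.
# Assumed objective: ½‖y − Xβ‖² + λ‖β‖₁ (hypothetical sketch, not the package's code).
function iter_ridge_sketch(X::AbstractMatrix, y::AbstractVector, λ::Real;
                           maxit = 1000, tol = 1e-8, ϵ = 1e-10)
    XtX = X' * X
    Xty = X' * y
    β = (XtX + λ * I) \ Xty                       # start from an ordinary ridge fit
    for _ in 1:maxit
        W = Diagonal(λ ./ (abs.(β) .+ ϵ))         # ϵ guards against division by zero
        βnew = (XtX + W) \ Xty                    # weighted ridge step
        converged = norm(βnew - β) < tol
        β = βnew
        converged && break
    end
    return β
end

X = Matrix(1.0I, 3, 3)
y = [3.0, -2.0, 0.5]
βhat = iter_ridge_sketch(X, y, 1.0)
```

With an orthonormal design the lasso solution under this objective is the soft-thresholded X'y, so `βhat` should approach [2, -1, 0] in this example; coefficients that the lasso sets to zero shrink toward zero but are not exactly zero.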

DegreesOfFreedom.mars_experiment_df_vs_df — Method

mars_experiment_df_vs_df(; ps = [1, 10, 50], folder = "/tmp", maxnk = 100)

Compare the nominal df and the actual df of MARS for different numbers of predictors ps.

DegreesOfFreedom.mars_experiment_mse — Method

mars_experiment_mse(; folder = "/tmp", ps = [2, 10, 20, 30, 40, 50, 60], with_cv = false)

Run MARS experiments with the default MARS and with MARS using the corrected penalty factor. If with_cv is true, also include MARS with the penalty factor corrected by 10-fold CV.

DegreesOfFreedom.mse_mars — Method

mse_mars(; d = 1, N = 200, p = 2, penalty = d+1, nk = 50, with_cv = false)

Calculate the proportional decrease in MSE for the default MARS and for MARS with corrected df. If with_cv is true, MARS with df corrected by cross-validation is also considered.

N and p: the dimensions of the data-generating model

DegreesOfFreedom.run_experiment_lasso_vs_subset — Method

run_experiment_lasso_vs_subset()

Run the experiment comparing the degrees of freedom of lasso and best subset. The results are saved into the figure shown in the paper.

DegreesOfFreedom.run_experiment_splines — Method

run_experiment_splines()

Run the spline experiment. The results are saved to a .sil file (which can later be loaded via deserialize) and a .tex file (the table displayed in the paper).

run_experiment_splines(folder = "/home/weiya/Overleaf/paperDoF/res/df")
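
Loading a saved result back uses Julia's standard `Serialization` module. The round trip below is a small sketch: the file name and the placeholder `res` object are hypothetical, since the actual file names and result structure are chosen by the experiment code.

```julia
using Serialization

# Placeholder result object (hypothetical; the real experiment stores its own structure).
res = Dict("Js" => [5, 10, 15], "df" => zeros(3))

path = joinpath(mktempdir(), "res.sil")   # hypothetical file name
serialize(path, res)                      # how a .sil file can be written
loaded = deserialize(path)                # how it can be read back later
loaded == res
```

Serialized files are generally tied to the Julia version and package definitions that wrote them, so they are best treated as a cache rather than a long-term storage format.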

DegreesOfFreedom.run_experiment_tree — Method

run_experiment_tree()

Run the regression tree experiment. The results are saved to a .sil file (which can later be loaded via deserialize) and a .tex file (the table in the paper).

DegreesOfFreedom.save_plots — Method

save_plots(ps; output)

Save multiple images into a single PDF file. If output is unspecified (the default), the resulting file is /tmp/all.pdf. See also: https://github.com/szcf-weiya/Xfunc.jl/blob/master/src/plot.jl

DegreesOfFreedom.vary_p — Function

vary_p(d = 1, ps = [2, 10, 20, 30, 40, 50, 60]; with_cv = false, nrep = 10, nMC = 100)

Run MARS experiments with different numbers of predictors ps for degree d.

nrep: number of replications
nMC: number of Monte Carlo samples for calculating df
with_cv: whether to include the comparison with the cross-validation method