Bias of Stochastic Gradient Descent or the Architecture: Disentangling the Effects of Overparameterization of Neural Networks

project page | paper

PyTorch implementation of the paper "Bias of Stochastic Gradient Descent or the Architecture: Disentangling the Effects of Overparameterization of Neural Networks", ICML 2024.

Bias of Stochastic Gradient Descent or the Architecture | Amit Peleg, Matthias Hein | ICML, 2024

Our implementation is based on the paper "Loss Landscapes are All You Need: Neural Network Generalization Can Be Explained Without the Implicit Bias of Gradient Descent" (ICLR 2023) and its GitHub repository.

Installation

Clone this repository

git clone https://github.com/AmitPeleg/generalization
cd generalization/

Create and activate the conda environment:

conda env create -f environment.yml
conda activate generalization

Configs

The configs folder contains the YAML configuration files needed to reproduce the figures in the paper. The files are grouped by the figure they reproduce. For example, the configs in configs/fig1/ reproduce Figure 1 in the paper.
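
To see which configs belong to which figure, a small helper along these lines may be useful. It is only a sketch and not part of the repository: the function name configs_by_figure is hypothetical, and it assumes the layout described above, with one folder per figure under configs/ and files ending in .yaml or .yml.

```python
from collections import defaultdict
from pathlib import Path


def configs_by_figure(configs_root="configs"):
    """Group YAML config paths by the figure folder they live in.

    Hypothetical helper: assumes a layout like configs/<fig_name>/.../*.yaml.
    """
    root = Path(configs_root)
    grouped = defaultdict(list)
    # Walk the tree in sorted order so the result is stable across runs.
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in {".yaml", ".yml"}:
            continue
        parts = path.relative_to(root).parts
        if len(parts) < 2:
            # Skip YAML files that sit directly in the root folder.
            continue
        # The first folder below the root is taken as the figure name.
        grouped[parts[0]].append(path.as_posix())
    return dict(grouped)
```

Calling configs_by_figure() from the repository root returns a dictionary that maps each figure folder name to the list of config paths found under it.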

Experiments

Single experiment

To reproduce a single experiment from the paper, run:

python train.py --config configs/path/to/config.yaml

The results for this config will be saved in the output folder.

The train.py script always runs on a single GPU (cuda:0). To choose which physical GPU is used, set the CUDA_VISIBLE_DEVICES environment variable. You can also run the same config on multiple GPUs by launching the script several times, each with a different CUDA_VISIBLE_DEVICES value. Running on multiple GPUs produces the same results as running on a single GPU.
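
If you prefer to launch runs from Python rather than from the shell, the GPU selection can be sketched as follows. This is an illustration only: build_train_job and run_train_job are hypothetical names that do not exist in the repository, and the sketch assumes it is executed from the repository root so that train.py is found.

```python
import os
import subprocess
import sys


def build_train_job(config_path, gpu_id, base_env=None):
    """Return (command, environment) for one train.py run pinned to one GPU."""
    env = dict(os.environ if base_env is None else base_env)
    # Only the chosen GPU is visible to the child process,
    # so inside train.py it shows up as cuda:0.
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    command = [sys.executable, "train.py", "--config", str(config_path)]
    return command, env


def run_train_job(config_path, gpu_id):
    """Launch the run and wait for it; raises CalledProcessError on failure."""
    command, env = build_train_job(config_path, gpu_id)
    subprocess.run(command, env=env, check=True)
```

For example, run_train_job("configs/path/to/config.yaml", 1) would run that config on the second physical GPU.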

Entire figure

To reproduce an entire figure, run:

python run_exps.py <fig_name>

A list of figure names can be found in the configs folder.

The output and error logs of each experiment are saved in the logs folder, and the results are stored in the output folder.

Each config runs on a single GPU. The run_exps.py script launches the runs for a given figure across the available GPUs. To use only a subset of the GPUs, do not set the CUDA_VISIBLE_DEVICES environment variable; instead, set GPU_LIST in the run_exps.py script (line 73). To better utilize each GPU, you can adjust the batch size in the config files.
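
To illustrate the idea of spreading a figure's configs over a list of GPUs, here is a minimal round-robin sketch. It is not the scheduling logic of run_exps.py, which may work differently; assign_configs_to_gpus is a hypothetical name, and gpu_list only plays the role that GPU_LIST plays in the script.

```python
from itertools import cycle


def assign_configs_to_gpus(config_paths, gpu_list):
    """Distribute configs over GPUs in round-robin order.

    Illustrative only: the actual run_exps.py scheduling may differ.
    """
    if not gpu_list:
        raise ValueError("gpu_list must contain at least one GPU id")
    assignment = {gpu: [] for gpu in gpu_list}
    # cycle() repeats the GPU ids so every config gets paired with one.
    for config, gpu in zip(config_paths, cycle(gpu_list)):
        assignment[gpu].append(config)
    return assignment
```

With five configs and gpu_list = [0, 2], GPU 0 receives the first, third and fifth config, and GPU 2 receives the second and fourth.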

Provided results

We provide the results for Figures 1-5 for the MNIST dataset and the LeNet architecture. The results should be downloaded and extracted to the output folder. They can also be downloaded directly from the command line:

mkdir output
cd output
wget https://nc.mlcloud.uni-tuebingen.de/index.php/s/fzc7xj8mp6YDc4Q/download/fig_1_to_5_lenet_mnist.zip
unzip fig_1_to_5_lenet_mnist.zip
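
On systems without wget or unzip, the same steps can be sketched with the Python standard library. The function names extract_results and download_results are hypothetical and not part of the repository; the URL is the one given above.

```python
import urllib.request
import zipfile
from pathlib import Path

RESULTS_URL = (
    "https://nc.mlcloud.uni-tuebingen.de/index.php/s/fzc7xj8mp6YDc4Q/"
    "download/fig_1_to_5_lenet_mnist.zip"
)


def extract_results(zip_path, output_dir="output"):
    """Extract a results archive into output_dir and return its member names."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(out)
        return archive.namelist()


def download_results(output_dir="output", url=RESULTS_URL):
    """Download the provided results archive and extract it into output_dir."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    zip_path = out / "fig_1_to_5_lenet_mnist.zip"
    # Requires network access to the server hosting the archive.
    urllib.request.urlretrieve(url, str(zip_path))
    return extract_results(zip_path, out)
```

Calling download_results() from the repository root mirrors the mkdir, wget and unzip commands above.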

Creating figures

After running the experiments, plot a figure with:

python -m create_figs.run_figs <fig_name>

A list of figure names can be found using python -m create_figs.run_figs --help. There may be slight differences between the provided results and the paper due to code reorganization and changes in the randomization seed.

Citation

@inproceedings{Peleg2024ICML_Bias_of_Stochastic,
  author={Amit Peleg and Matthias Hein},
  title={Bias of Stochastic Gradient Descent or the Architecture: Disentangling the Effects of Overparameterization of Neural Networks},
  booktitle={ICML},
  year={2024}
}
