To create the conda environment 'HELP', run the following command:
conda env create -f environment.yml
Then activate the environment using:
conda activate HELP
Copy the .env.template file to .env, then fill in the values for the environment variables in .env.
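The copy step can be sketched as a small shell helper that never overwrites an existing .env. The function name setup_env is hypothetical and not part of the repository; it assumes you run it from the directory that contains .env.template.

```shell
# setup_env is a hypothetical helper name, not something shipped with the repository.
# It creates .env from .env.template and leaves an existing .env untouched.
setup_env() {
    if [ -f .env ]; then
        echo ".env already exists; leaving it untouched"
    else
        cp .env.template .env
        echo "Created .env; open it and fill in the values"
    fi
}
```

After defining the function, call setup_env once and then edit .env by hand. The existence check means that re-running the helper does not discard values you have already filled in.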
The datasets are publicly available on Zenodo.
Download them and place them in the full_dataset/ directory, following the same layout as Apache/.
To measure the performance of HELP on the benchmark datasets, run the following commands in order:
# Precompute the OpenAI embeddings
python CacheOpenAI.py
# Create the cross-validation training data
python CustomEmbeddingsPreprocess.py
# Train and test the model
python HyperparamScan.py
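If you prefer to launch the three benchmark steps in one go, a minimal wrapper might look like the sketch below. The function name run_benchmark and the PYTHON override variable are assumptions introduced for illustration; only the three script names come from the steps listed above. The steps are chained with && so that a failure stops the chain, on the assumption that each step needs the output of the one before it.

```shell
# run_benchmark is a hypothetical wrapper; only the three script names come from this README.
# PYTHON is an assumed override so a different interpreter can be used (defaults to "python").
run_benchmark() {
    "${PYTHON:-python}" CacheOpenAI.py &&
    "${PYTHON:-python}" CustomEmbeddingsPreprocess.py &&
    "${PYTHON:-python}" HyperparamScan.py
}
```

Call run_benchmark inside the activated HELP environment. The function returns the exit status of the first step that fails, or 0 if all three succeed.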
To reproduce the UMAP plots, run:
# Plot the NN variant
python UMapPlot.py --nn
# Plot the non-NN variant
python UMapPlot.py
To recreate the paper's ablation results for the HELP components:
HELP w/o LLM:
python RebalanceMergeNN.py
HELP w/o prev. & NN:
python Rebalance_Merge.py
HELP w/o prev. & WC:
python CacheOpenAI.py --no_wc
python Rebalance_Merge.py --no_wc
HELP w/o prev. & IRM:
python NaiveFixed.py
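The four ablation variants above can also be selected by name through a small dispatcher, sketched here. The function name run_ablation, the variant keys (no_llm, no_nn, no_wc, no_irm), and the PYTHON override are hypothetical labels chosen for this example; the scripts and the --no_wc flag are the ones listed above.

```shell
# run_ablation and its variant keys are hypothetical; the scripts and flags are those listed above.
run_ablation() {
    py="${PYTHON:-python}"
    case "$1" in
        no_llm) "$py" RebalanceMergeNN.py ;;            # HELP w/o LLM
        no_nn)  "$py" Rebalance_Merge.py ;;             # HELP w/o prev. & NN
        no_wc)  "$py" CacheOpenAI.py --no_wc &&
                "$py" Rebalance_Merge.py --no_wc ;;     # HELP w/o prev. & WC
        no_irm) "$py" NaiveFixed.py ;;                  # HELP w/o prev. & IRM
        *)      echo "unknown ablation variant: $1" >&2
                return 2 ;;
    esac
}
```

For example, run_ablation no_wc runs the two --no_wc commands in sequence and stops if the first one fails. An unrecognised key prints an error to standard error and returns status 2.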