18980 Graduate Research Project: Constraint Satisfaction in LLMs Using Dynamic Penalty Integration in DPO