feat-dpo-words-count-1.0 (#598)
* feat-dpo-words-count-1.0

Cookbook for word-count control training samples, including fine-tuning and evaluation

* Update main.ipynb

* feat_dpo_words_count

Implement the word-count control DPO fine-tuning task; complete the corresponding experiments and validation

* Update main.ipynb

(Fixed a code bug)
SFT + DPO training and validation

* Update main.ipynb

* Update main.ipynb

* Update load_statistics.py

* Update main.ipynb

* fix

* Modify eval logic

* Update main.ipynb

* Update main.ipynb

* Update load_statistics.py

---------

Co-authored-by: zhonghanjun <[email protected]>
Alex-TG001 and danielhjz authored Jun 21, 2024
1 parent bf6389b commit 64a1e9e
Showing 2 changed files with 760 additions and 0 deletions.
39 changes: 39 additions & 0 deletions cookbook/awesome_demo/dpo_words_count_control/eval.py
@@ -0,0 +1,39 @@
from qianfan.model import Model
from qianfan.common import Prompt
from qianfan import Completion
import re
from qianfan.dataset import Dataset

def eval(version_id, ds):
    result_ds = ds.test_using_llm(model_version_id=version_id)
    res = []
    for i in result_ds:

        # Extract the character limit from the prompt (e.g. "100字")
        char_limit_match = re.search(r"(\d+)字", i['prompt'])
        if not char_limit_match:
            raise ValueError("No character limit found in the input")

        limit = int(char_limit_match.group(1))
        fact = len(i['llm_output'])

        # Compute the ratio of actual length to the requested limit
        ratio = fact / limit
        abs_diff = abs(ratio - 1)

        # Assign a score based on the size of the absolute deviation
        if abs_diff <= 0.05:
            score = 1
        elif abs_diff <= 0.10:
            score = 0.75
        elif abs_diff <= 0.15:
            score = 0.5
        elif abs_diff <= 0.20:
            score = 0.25
        elif abs_diff <= 0.25:
            score = 0.1
        else:
            score = 0
        res.append(score)
    avg = sum(res) / len(res)
    return avg
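
The scoring rule in eval.py can be tried without calling a model. The following is a minimal sketch, not part of the commit: `score_length` and `BANDS` are hypothetical names introduced here for illustration, and the thresholds are copied from the `if`/`elif` chain above.

```python
import re

# Hypothetical table of (max relative deviation, score) pairs,
# copied from the if/elif chain in eval.py.
BANDS = [(0.05, 1.0), (0.10, 0.75), (0.15, 0.5), (0.20, 0.25), (0.25, 0.1)]


def score_length(prompt: str, output: str) -> float:
    """Score how closely len(output) matches the "<N>字" limit in prompt."""
    match = re.search(r"(\d+)字", prompt)
    if not match:
        raise ValueError("No character limit found in the input")
    limit = int(match.group(1))
    # Relative deviation of the actual length from the requested length
    abs_diff = abs(len(output) / limit - 1)
    for threshold, score in BANDS:
        if abs_diff <= threshold:
            return score
    return 0.0


if __name__ == "__main__":
    prompt = "请写一篇100字的短文"
    for n in (100, 108, 88, 130):
        print(n, score_length(prompt, "a" * n))
```

Because the deviation is computed in floating point, an output whose length lands exactly on a band edge may fall on either side of it, so the example lengths above are chosen away from the edges.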