ReLIFT is a training method that interleaves RL with online fine-tuning (FT), achieving better performance and efficiency than RL or SFT alone.
马路
RoadQAQ
AI & ML interests
None yet
Recent Activity
updated a collection about 2 months ago: Data for Dataflex
updated a dataset about 2 months ago: OpenDCAI/dataflex_10w_cot_from_numinamath_coT
published a dataset about 2 months ago: OpenDCAI/dataflex_10w_cot_from_numinamath_coT