RLAIF
English
Noun
RLAIF (uncountable)
- (machine learning) Initialism of reinforcement learning from AI feedback.
- 2023, “RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback”, in arXiv:
- Reinforcement learning from human feedback (RLHF) has proven effective in aligning large language models (LLMs) with human preferences. However, gathering high-quality human preference labels can be a time-consuming and expensive endeavor. RL from AI Feedback (RLAIF), introduced by Bai et al., offers a promising alternative that leverages a powerful off-the-shelf LLM to generate preferences in lieu of human annotators.
- 2023 October 6, Tasmia Ansari, “Reinforcement Learning Craves Less Human, More AI”, in Analytics India Magazine:
- a prime hurdle lies in gathering high-quality human preference labels. This is where reinforcement learning from human feedback with AI feedback (RLAIF) comes into the picture, a novel framework by Google Research to train models with reduced reliance on human intervention.
This article is issued from Wiktionary. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.