Reward Collapse in Aligning Large Language Models - MaRDI portal

Reward Collapse in Aligning Large Language Models (Q6438249)

From MaRDI portal

preprint article from arXiv

Language	Label	Description	Also known as
English	Reward Collapse in Aligning Large Language Models	preprint article from arXiv

Statements

scholarly article

publication date: 27 May 2023

arXiv classification: cs.LG, cs.AI, cs.CL, math.OC, stat.ML

author name string: Ziang Song, Tianle Cai, Jason D. Lee, Weijie J. Su

MaRDI profile type

has companion code repository: https://github.com/ctlllll/reward_collapse
Reference (PapersWithCode reference URL): https://paperswithcode.com/paper/reward-collapse-in-aligning-large-language

publication

Identifiers

Sitelinks

Mathematics (1 entry)

mardi Publication:6438249

Retrieved from "https://mardi.schubotz.org/w/index.php?title=Item:Q6438249&oldid=40688052"