Prompts generated from ChatGPT3.5, ChatGPT4, LLama3-8B, and Mistral-7B with NYT and HC3 topics in different roles and parameters configurations (Q6696307)

Dataset published at Zenodo repository.

Language	Label	Description	Also known as
English	Prompts generated from ChatGPT3.5, ChatGPT4, LLama3-8B, and Mistral-7B with NYT and HC3 topics in different roles and parameters configurations	Dataset published at Zenodo repository.

Statements

instance of

data set

0 references

description

Description

Prompts generated from ChatGPT3.5, ChatGPT4, Llama3-8B, and Mistral-7B with NYT and HC3 topics in different roles and parameter configurations.

The dataset is useful for studying lexical aspects of LLMs under different parameter and role configurations.

The 0_Base_Topics.xlsx file lists the topics used to generate the dataset. The rest of the files collect the answers of ChatGPT to these topics with different configurations of parameters/context:

Temperature (parameter): Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.
Frequency penalty (parameter): Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
Top probability (parameter): An alternative to sampling with temperature, called nucleus sampling, where the model considers only the tokens that make up the top_p probability mass.
Presence penalty (parameter): Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

Roles (context)

Default: No role is assigned to the LLM; the default role is used.
Child: The LLM is requested to answer as a five-year-old child.
Young adult male: The LLM is requested to answer as a young male adult.
Young adult female: The LLM is requested to answer as a young female adult.
Elderly adult male: The LLM is requested to answer as an elderly male adult.
Elderly adult female: The LLM is requested to answer as an elderly female adult.
Affluent adult male: The LLM is requested to answer as an affluent male adult.
Affluent adult female: The LLM is requested to answer as an affluent female adult.
Lower-class adult male: The LLM is requested to answer as a lower-class male adult.
Lower-class adult female: The LLM is requested to answer as a lower-class female adult.
Erudite: The LLM is requested to answer as an erudite who uses a rich vocabulary.

Paper

Beware of Words: Evaluating the Lexical Diversity of Conversational LLMs using ChatGPT as Case Study

Cite:

@article{10.1145/3696459,
author = {Mart\'{\i}nez, Gonzalo and Hern\'{a}ndez, Jos\'{e} Alberto and Conde, Javier and Reviriego, Pedro and Merino-G\'{o}mez, Elena},
title = {Beware of Words: Evaluating the Lexical Diversity of Conversational LLMs using ChatGPT as Case Study},
year = {2024},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
issn = {2157-6904},
url = {https://doi.org/10.1145/3696459},
doi = {10.1145/3696459},
abstract = ,
note = {Just Accepted},
journal = {ACM Trans. Intell. Syst. Technol.},
month = sep,
keywords = {LLM, Lexical diversity, ChatGPT, Evaluation}
}
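
As a rough illustration of the four sampling parameters named in the description (temperature, frequency penalty, presence penalty, and top probability), the following Python sketch applies each of them to a toy next-token distribution. It is not the code used to build the dataset, nor any provider's actual implementation: the function names, the three-token vocabulary, and the logit values are all hypothetical, and the penalty arithmetic assumes the commonly documented form in which each penalty is subtracted from the logits of already generated tokens.

```python
import math
from collections import Counter


def adjust_logits(logits, generated, frequency_penalty=0.0, presence_penalty=0.0):
    """Lower the logits of tokens that already appear in `generated`.

    Assumed form: logit - count * frequency_penalty - (count > 0) * presence_penalty.
    """
    counts = Counter(generated)
    return {
        tok: logit
        - counts[tok] * frequency_penalty
        - (1.0 if counts[tok] > 0 else 0.0) * presence_penalty
        for tok, logit in logits.items()
    }


def softmax_with_temperature(logits, temperature=1.0):
    """Turn logits into probabilities; `temperature` must be greater than 0."""
    scaled = {tok: logit / temperature for tok, logit in logits.items()}
    peak = max(scaled.values())  # subtract the maximum for numerical stability
    exps = {tok: math.exp(value - peak) for tok, value in scaled.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


def top_p_filter(probs, top_p=1.0):
    """Keep the most likely tokens until their mass reaches `top_p`, then renormalize."""
    kept, mass = {}, 0.0
    for tok, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[tok] = p
        mass += p
        if mass >= top_p:
            break
    return {tok: p / mass for tok, p in kept.items()}


if __name__ == "__main__":
    toy_logits = {"cat": 2.0, "dog": 1.0, "fish": 0.0}  # hypothetical values

    # A lower temperature concentrates probability on the most likely token.
    print(softmax_with_temperature(toy_logits, temperature=0.2))
    print(softmax_with_temperature(toy_logits, temperature=0.8))

    # Penalties reduce the score of a token that was already generated twice.
    penalized = adjust_logits(
        toy_logits, ["cat", "cat"], frequency_penalty=0.5, presence_penalty=1.0
    )
    print(penalized)

    # Nucleus sampling discards the low-probability tail before sampling.
    print(top_p_filter(softmax_with_temperature(toy_logits), top_p=0.9))
```

Running the script prints the distributions so the effect of each parameter can be compared side by side; the roles listed in the description are a matter of prompt context and are not modelled here.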

0 references

publication date

4 May 2024

0 references

author

Gonzalo Martínez

0 references

José Alberto Hernández

0 references

Javier Conde

0 references

Pedro Reviriego