Dataset - What are the Machine Learning best practices reported by practitioners on Stack Exchange? - MaRDI portal

Deprecated: Use of MediaWiki\Skin\SkinTemplate::injectLegacyMenusIntoPersonalTools was deprecated in Please make sure Skin option menus contains `user-menu` (and possibly `notifications`, `user-interface-preferences`, `user-page`) 1.46. [Called from MediaWiki\Skin\SkinTemplate::getPortletsTemplateData in /var/www/html/w/includes/Skin/SkinTemplate.php at line 691] in /var/www/html/w/includes/Debug/MWDebug.php on line 372

Deprecated: Use of QuickTemplate::(get/html/text/haveData) with parameter `personal_urls` was deprecated in MediaWiki Use content_navigation instead. [Called from MediaWiki\Skin\QuickTemplate::get in /var/www/html/w/includes/Skin/QuickTemplate.php at line 131] in /var/www/html/w/includes/Debug/MWDebug.php on line 372

Dataset - What are the Machine Learning best practices reported by practitioners on Stack Exchange?

From MaRDI portal



DOI: 10.5281/zenodo.8058979
Zenodo: 8058979
MaRDI QID: Q6696302

Dataset published at Zenodo repository.

Author name not available

Publication date: 8 May 2023

Copyright license: No records found.



The data correspond to the posts (questions and answers) retrieved by querying for posts related to the tag machine learning and the phrase best practice(s). The data were used as the basis for a study, currently under review, on machine learning best practices as discussed by practitioners in question-and-answer communities such as Stack Exchange. The information from each type of post (i.e., questions and answers) is presented in multiple formats (i.e., .txt, .csv, and .xlsx).

Answers - Variables

AID: Unique identification of the answer in the QA website.
ParentId: Unique identification of the question associated with the answer in the QA website.
AcceptedAnswerId: In the case in which an answer is the most voted answer associated with the ParentId, and it is different from the accepted answer, a different identifier from the AID is available. In the case in which the accepted answer had a score lower than 1, a -1 is assigned.
ABody: HTML text of the answer.
Score: Upvotes minus downvotes of the answer.
url_Answer: URL of the answer. The question URL can be from different websites.
type: best or accepted. "accepted" in the case that the information belongs to the accepted answer of the ParentId question, and "best" in the case in which it is the most voted answer of the ParentId question.
Date: Creation date of the answer.

Questions - Variables

QID: Unique identification of the question in the QA website.
AcceptedAnswerId: Unique identification of the accepted answer for a specific question in the QA website. In the case in which a question had a most-voted answer different from the accepted one, and the accepted one had a negative score, a -1 was assigned to the AcceptedAnswerId.
BestAnswerId: Unique identification of the most voted answer for a specific question in the QA website. In the case in which the most voted and accepted answers were the same, a -1 was assigned to the BestAnswerId.
Qtitle: Title of the question.
QBody: HTML text of the question.
Score: Upvotes minus downvotes of the question.
QTags: Tags that are associated with each question.
url_question: URL of the question. The question URL can be from different websites.
Date: Creation date of the question.

This dataset is a subset of the Stack Exchange dump of 03.2021 (https://archive.org/details/stackexchange_20210301), to which a series of filters was applied to obtain the data used in the study.
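
The variables described above suggest how the two tables relate: an answer's ParentId points at a question's QID, and -1 marks an identifier that was not assigned. The following is a minimal sketch of that join using only the Python standard library. It assumes the .csv files carry column headers spelled exactly as the variable names listed here, which the record does not confirm; the sample rows and the helper names (read_rows, join_answers_to_questions, resolve_id) are hypothetical and exist only for illustration.

```python
import csv
import io

# Per the description, -1 marks an identifier that was not assigned.
SENTINEL = "-1"


def read_rows(handle):
    """Read an open CSV file object into a list of dicts keyed by column name."""
    return list(csv.DictReader(handle))


def resolve_id(value):
    """Return None for the -1 sentinel, otherwise the identifier as a string."""
    value = value.strip()
    return None if value == SENTINEL else value


def join_answers_to_questions(questions, answers):
    """Attach each answer to its question by matching ParentId to QID.

    Assumes the column names QID, ParentId and type match the variable
    names in the dataset description (an assumption, not verified).
    """
    joined = {
        q["QID"]: {"question": q, "accepted": None, "best": None}
        for q in questions
    }
    for answer in answers:
        entry = joined.get(answer["ParentId"])
        if entry is None:
            continue  # answer whose question is absent from the questions table
        kind = answer["type"].strip().lower()
        if kind in ("accepted", "best"):
            entry[kind] = answer
    return joined


if __name__ == "__main__":
    # Hypothetical in-memory rows standing in for the real .csv files.
    questions_csv = "QID,AcceptedAnswerId,BestAnswerId,Qtitle\n10,101,-1,Example\n"
    answers_csv = "AID,ParentId,Score,type\n101,10,5,accepted\n"
    questions = read_rows(io.StringIO(questions_csv))
    answers = read_rows(io.StringIO(answers_csv))
    joined = join_answers_to_questions(questions, answers)
    print(joined["10"]["accepted"]["AID"])
    print(resolve_id(questions[0]["BestAnswerId"]))
```

To use this with the actual files, one would pass open file handles for the questions and answers .csv files to read_rows in place of the in-memory strings. Treating -1 as "no identifier" before any lookup avoids matching it against a real AID by accident.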





