Dataset for the Article "A Predictive Method to Improve the Effectiveness of Twitter Communication in a Cultural Heritage Scenario" - MaRDI portal



From MaRDI portal



DOI: 10.5281/zenodo.4782984
Zenodo: 4782984
MaRDI QID: Q6701078

Dataset published at Zenodo repository.

Author name not available

Publication date: 24 May 2021

Copyright license: No records found.



This is the dataset for the article "A Predictive Method to Improve the Effectiveness of Twitter Communication in a Cultural Heritage Scenario".

Abstract: Museums are embracing social technologies in an attempt to broaden their audience and engage people. Although social communication seems an easy task, media managers know how hard it is to reach millions of people with a simple message. Indeed, millions of posts compete every day for visibility in terms of likes and shares, and very little research has focused on museum communication to identify best practices. In this paper, we focus on Twitter and propose a novel method that exploits interpretable machine learning techniques to: (a) predict whether a tweet is likely to be appreciated by Twitter users; (b) present simple suggestions that help enhance the message and increase the probability of its success. Using a real-world dataset of around 40,000 tweets written by 23 world-famous museums, we show that our proposed method allows identifying the tweet features that are more likely to influence a tweet's success.

Code to run a selection of experiments is available at https://github.com/rmartoglia/predict-twitter-ch

Dataset structure

The dataset contains the data used in the experiments of the above research paper. Only the extracted features for the museum tweet threads (and not the full message text) are provided, since only these are needed for the analyses. We selected 23 well-known art museums from around the world and grouped them into five groups: G1 (museums with at least three million followers); G2 (museums with more than one million followers); G3 (museums with more than 400,000 followers); G4 (museums with more than 200,000 followers); G5 (Italian museums). From these museums, we analyzed ca. 40,000 tweets, with ca. 5,000 to ca. 11,000 tweets per museum group, depending on the number of museums in each group.
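The grouping rule described above can be sketched as a small function. This is only an illustration, not the authors' code: the function name `museum_group`, the `is_italian` flag, and the choice to assign Italian museums to G5 before checking follower counts are all assumptions, since the description does not say how the groups were resolved when a museum could fit more than one.

```python
from typing import Optional


def museum_group(followers: int, is_italian: bool = False) -> Optional[str]:
    """Assign a museum to one of the five groups from the description.

    Assumption: Italian museums go to G5 regardless of follower count.
    Returns None when no group applies.
    """
    if is_italian:
        return "G5"
    if followers >= 3_000_000:  # "at least three million followers"
        return "G1"
    if followers > 1_000_000:   # "more than one million followers"
        return "G2"
    if followers > 400_000:     # "more than 400,000 followers"
        return "G3"
    if followers > 200_000:     # "more than 200,000 followers"
        return "G4"
    return None
```

The thresholds are checked from largest to smallest so that each museum lands in the highest group it qualifies for.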
Content features: these are the features that can be drawn from the content of the tweet itself. We further divide such features into the following two categories:

Countable: these features take values in different ranges. We take into consideration: the number of hashtags (i.e., words preceded by #) in the tweet, the number of URLs (i.e., links to external resources), the number of images (e.g., photos and graphical emoticons), the number of mentions (i.e., Twitter accounts preceded by @), and the length of the tweet.

On-Off: these features have binary values in {0, 1}. We observe whether the tweet has exclamation marks, question marks, person names, place names, organization names, or other names. Moreover, we also take into consideration the tweet topic density: assuming that the topics involved correspond to the hashtags mentioned in the text, we define a tweet as dense in topics if the number of hashtags it contains is greater than a given threshold, set to 5. Finally, we observe the tweet sentiment, which may be present (positive or negative) or absent (neutral).

Context features: these features are not drawn from the content of the tweet itself and may give a larger picture of the context in which the tweet was sent. Namely, we take into consideration the part of the day in which the tweet was sent (morning, afternoon, evening, and night, respectively from 5:00am to 11:59am, from 12:00pm to 5:59pm, from 6:00pm to 10:59pm, and from 11:00pm to 4:59am), and a boolean feature indicating whether the tweet is a retweet or not.

User features: these features are specific to the user who sent the tweet and are the same for all the tweets of this user. Namely, we consider the name of the museum and the number of followers of the user.
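To make the feature definitions above concrete, here is a minimal sketch of how a few of them could be computed from raw tweet text and a posting time. It is not the authors' extraction code. The function names, dictionary keys, and regular expressions are hypothetical and need not match the column names in the dataset. Images, named entities, and sentiment are left out because they require tweet metadata or NLP models that the description does not specify.

```python
import re
from datetime import time

TOPIC_DENSITY_THRESHOLD = 5  # threshold stated in the dataset description

URL_RE = re.compile(r"https?://\S+")  # simplified URL pattern (assumption)


def part_of_day(t: time) -> str:
    """Map a posting time to the four parts of the day listed above."""
    if time(5, 0) <= t < time(12, 0):
        return "morning"
    if time(12, 0) <= t < time(18, 0):
        return "afternoon"
    if time(18, 0) <= t < time(23, 0):
        return "evening"
    return "night"  # 11:00pm to 4:59am wraps around midnight


def content_features(text: str) -> dict:
    """Compute some countable and on-off content features of a tweet."""
    urls = URL_RE.findall(text)
    # Strip URLs first so that '#' or '?' inside a link is not counted.
    stripped = URL_RE.sub(" ", text)
    hashtags = re.findall(r"#\w+", stripped)
    mentions = re.findall(r"@\w+", stripped)
    return {
        "n_hashtags": len(hashtags),
        "n_urls": len(urls),
        "n_mentions": len(mentions),
        "length": len(text),
        "has_exclamation": int("!" in stripped),
        "has_question": int("?" in stripped),
        # Dense in topics: strictly more hashtags than the threshold.
        "topic_dense": int(len(hashtags) > TOPIC_DENSITY_THRESHOLD),
    }
```

For example, `content_features("Don't miss #Monet at @museum! https://example.org/show?id=1")` counts one hashtag, one mention, and one URL, flags the exclamation mark, and does not flag a question mark, because the `?` occurs only inside the link.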





