OIE4PA: Open Information Extraction for the Public Administration - MaRDI portal


OIE4PA: Open Information Extraction for the Public Administration

From MaRDI portal



DOI: 10.5281/zenodo.8331106
Zenodo: 8331106
MaRDI QID: Q6718885

Dataset published at Zenodo repository.

Author name not available

Publication date: 9 September 2023

Copyright license: No records found.



Tenders are a powerful means of investing public funds and represent a strategic development resource. Despite the efforts made so far by governments at national and international levels to digitalise documents related to the Public Administration sector, most of the information is still available in an unstructured format only. With the aim of bridging this gap, we present OIE4PA, our latest study on extracting and classifying relations from tenders of the Public Administration. Our work focuses on the Italian language, where the availability of linguistic resources to perform Natural Language Processing tasks is considerably limited. For evaluation purposes, we built a dataset composed of 2,000 triples extracted from Italian tenders, which have been manually annotated by two human experts.

The dataset, compressed in a single zip file, is composed of:

- The corpus of 6,262 texts extracted from Italian public tenders (corpus_tenders)
- The training set of 1,600 annotated triples (training_set)
- The test set of 400 annotated triples (test_set)
- The set U of 14,096 triples used for the self-training (u_triples_dd)
- A compressed archive that contains both the extracted triples and the index for each supervised approach (extraction)
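As a rough aid for inspecting the downloaded archive, the following Python sketch groups the members of a zip file by the component names listed in the description. The component names come from the description itself; the file layout, file extensions, and the archive's file name are not stated there, so the substring matching used here is an assumption and may need adjusting once the real contents are known.

```python
import zipfile

# Component names exactly as listed in the dataset description.
COMPONENTS = (
    "corpus_tenders",
    "training_set",
    "test_set",
    "u_triples_dd",
    "extraction",
)


def group_members_by_component(archive):
    """Group zip members by the first listed component name found in their path.

    `archive` may be a path or a binary file-like object. Matching by
    substring is an assumption: the description does not specify how the
    files inside the archive are actually named.
    """
    groups = {name: [] for name in COMPONENTS}
    with zipfile.ZipFile(archive) as zf:
        for member in zf.namelist():
            for name in COMPONENTS:
                if name in member:
                    groups[name].append(member)
                    break  # assign each member to at most one component
    return groups
```

Calling `group_members_by_component("oie4pa.zip")` (a hypothetical file name) would return a dictionary keyed by component name; an empty list for a component suggests the archive uses a different naming scheme than assumed here.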





