On Approximating String Selection Problems with Outliers

From MaRDI portal
Publication:2904515

DOI10.1007/978-3-642-31265-6_34zbMATH Open1358.68121arXiv1202.2820OpenAlexW2165751994MaRDI QIDQ2904515

Author name not available (Why is that?)

Publication date: 14 August 2012

Published in: (Search for Journal in Brave)

Abstract: Many problems in bioinformatics are about finding strings that approximately represent a collection of given strings. We look at more general problems where some input strings can be classified as outliers. The Close to Most Strings problem is, given a set S of same-length strings, and a parameter d, find a string x that maximizes the number of "non-outliers" within Hamming distance d of x. We prove this problem has no PTAS unless ZPP=NP, correcting a decade-old mistake. The Most Strings with Few Bad Columns problem is to find a maximum-size subset of input strings so that the number of non-identical positions is at most k; we show it has no PTAS unless P=NP. We also observe Closest to k Strings has no EPTAS unless W[1]=FPT. In sum, outliers help model problems associated with using biological data, but we show the problem of finding an approximate solution is computationally difficult.


Full work available at URL: https://arxiv.org/abs/1202.2820




No records found.








This page was built for publication: On Approximating String Selection Problems with Outliers

Report a bug (only for logged in users!)Click here to report a bug for this page (MaRDI item Q2904515)