Double Digest Revisited: Complexity and Approximability in the Presence of Noisy Data | |
Authors: | Mark Cieliebak, Stephan Eidenbenz, and Gerhard J. Woeginger |
Reference: | Technical Report 382, ETH Zurich, Department of Computer Science, November 2002 |
Download: | Postscript (.ps) or PDF (.pdf) |
Abstract: |
We revisit the Double Digest problem, which occurs
in sequencing of large DNA strings and consists of reconstructing the
relative positions of cut sites from two different enzymes: we first show
that double digest is strongly NP-complete, improving previous results
that only showed weak NP-completeness. Even the (experimentally more meaningful)
variation in which we disallow coincident cut sites turns out to be strongly
NP-complete. In a second part, we model errors in data as they occur in
real-life experiments: we propose several optimization variations of Double
Digest that model partial cleavage errors, which occur for various reasons.
We then show APX-completeness for most of these variations. In a third
part, we investigate these variations with the additional restriction
that conincident cut sites are disallowed and we show that it is NP-hard
to even find feasible solutions in this case, thus making it impossible
to guarantee any approximation ratio at all.
|
Remarks: |