We can either provide false sense of security to users of those 1.7%
of
sites or provide much better security to users of nearly 18% of sites
that also support other cipher suites.

IMHO, it is a situation similar to, say, the case when you get an expired
server certificate. The program should refuse to talk to such server by
default but it should explain the problem to the user and make it possible
to override the default policy (*).
(*) Yes, I am *painfully aware* that 99 out of 100 users are idiots who
should never be trusted to make such decisions but I think we'd better
leave the business of making software attempting to compensate for
intellectual deficiencies of its user to, ahem, certain other vendors.
RC4 is broken. While the latest attack against it does require few
million
connections (which we know that some actors already do collect) it also,
for all intents and purposes, has computational complexity of 0. Researchers
performed only 256 tries for their guesses - that's 8bit computational
complexity for the attack.

Well, you need to process those millions of collected ciphertexts too...
But yes, ~10^6 ops are still close enough to zero nowadays.
Also note that a service that connects to a site every minute for a
year
will reach the needed threshold for the easiest attack that recovers "only"
3 bytes with 100% accuracy.

You assume the client keeps sending the same secret data for a year. This
might be possible just as it might be possible to coax the client to spam
the server with a rapid sequence of repeated requests but (in both cases)
with the important "under certain circumstances" qualification.
Also, perfect 100% accuracy (= zero probability of error) is impossible
(see below) but you can get as close as you desire.
Now, unless someone has done the maths, I'm going to use
conservative
estimate and assume a one to one ratio for memory-time trade off. That
means that RC4 has 38 bit computational complexity of attack for a
capture of just few connections. That's export grade crypto level.

I might have missed something here but I am afraid you are mistaken. As
far I know the attack--or its simple single-byte variant--works as
follows:
You are interested in some byte b of plaintext that its transmitted
repeatedly and you know it always corresponds to n-th byte of ciphertext.
You assume some prior probability distribution P(b = x) of possible
values of b.
You observe the interesting byte c of ciphertext and get the probability
distribution P(k_n = y) of the n-th byte of keystream, assuming a randomly
chosen keys. Compute the posterior prob. distribution P' for values of b
assuming ciphertext c by Bayesian inference.
The result would be the same as the original of P(b = x) if P(k_n = y)
were uniform. But it is biased and this makes some values of the unknown
plaintext byte b slightly more probable a posteriori than they were a
priori (at the expense of other values). Repeat (with priors replaced
by posteriors) with more ciphertexts until the probability of one value
grows large enough.
(AlFardan et al. describe a different and possibly somewhat more efficient
approach to find the maximum-likelihood choice of a plaintext byte.)
If you capture "just few" connections you do not collect enough
information to distinguish between the correct value and incorrect values,
and no amount of computing power will help you (that is unless you have
got enough to crack the cipher directly).
--
Pavel Kankovsky aka Peak "Que sais-je?"