Solution 1:

If all weights are with a factor W of each other, then you can get a
W-approximation by just pretending they're all equal and applying
the linear-time solution mentioned in your question's last paragraph.


One can get an efficient ​ $\ln(Q_k)$-approximation ​ by
using Algorithm 3.1.7 and finding the $\operatorname{argmax}$s as follows:

Vertices in the same column are not adjacent, so cliques have at most one vertex in each column. ​ ​ ​ Brute-force over the possibilities for which set P of not-already-covered columns the
non-empty clique uses. ​ ​ ​ (There are at only 2$^k$-1 possibilities, and if ​ 1 < k ​ then ​ 2$^k$-1 < Qk .)
For each column in P, find the minimum weight of not-already-covered vertices
whose superscripts each include the indices of the other columns in P.
If any elements of P have no such vertices then P does not induce any cliques,
else with M being the maximum of those minimum weights,
[M is the weight of the minimum-weight cliques that induce P] and
[those cliques are formed by choosing, for each column in P, exactly one of the vertices in that column which is such that [the vertex's superscript includes the indices of all other columns in P] and [the vertex's weight is at most M]].



By "subgraph" in the rest of this answer, I mean induced.


Pretending the not-in-the-subgraph vertices are already covered turns that into
a poly$(Q_k)$-time ln(n)-approximation for n-vertex subgraphs of your graphs.
Observe that solutions always restrict to no-more-expensive solutions to subgraphs, so in particular the minimum cost for a subgraph cannot exceed the minimum cost for the original graph.
Thus, what I described two sentences ago can be used as a heuristic to hopefully get better
lower bounds, and as a special case of that, yields an algorithm which provably achieves better approximation bounds when most weights are much smaller than the maximum weight:

Split the graph into a large low-weight part and a small (n-vertex) high-weight part.
Combining a cover of the low-weight part with a ln(n)-approximation for
the high-weight part yields a cover of the input graph whose cost is at most
[cost for low-weight part] ​ ​ ​ + ​ ​ ​ [ ln(n) ​ $\cdot$ ​ [optimum cost for input graph] ] ​ .


If n is sufficiently small, then you can do even better by using an
Exponential-Time Approximation of Weighted Set Cover on the high-weight part.