The leftmost ORDER BY items must not disagree with the items of the DISTINCT ON clause. The manual's section on DISTINCT states this requirement.

Try:
SELECT *
FROM  (
   SELECT DISTINCT ON (c.cluster_id, feed_id)
          c.cluster_id, num_docs, feed_id, url_time
   FROM   url_info u
   JOIN   cluster_info c ON (c.cluster_id = u.cluster_id)
   WHERE  feed_id IN (SELECT pot_seeder FROM potentials)
   AND    num_docs > 5
   AND    url_time > '2012-04-16'
   ORDER  BY c.cluster_id, feed_id, num_docs, url_time
          -- first columns match DISTINCT
          -- the rest to pick certain values for dupes
          -- or did you want to pick random values for dupes?
   ) x
ORDER  BY num_docs DESC;
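
To see how the ORDER BY columns after the DISTINCT ON columns decide which duplicate survives, here is a minimal sketch on inline data. The VALUES rows and the alias t are made up for illustration and are not part of the question's schema.

```sql
-- Hypothetical sample rows: two duplicates for (1, 10), one row for (2, 20).
SELECT DISTINCT ON (cluster_id, feed_id)
       cluster_id, feed_id, num_docs
FROM  (VALUES (1, 10, 7)
            , (1, 10, 9)
            , (2, 20, 6)) AS t(cluster_id, feed_id, num_docs)
ORDER  BY cluster_id, feed_id   -- leftmost items match DISTINCT ON
        , num_docs DESC;        -- tie-breaker: keep the row with the most docs
-- Returns (1, 10, 9) and (2, 20, 6).
```

Without the trailing num_docs DESC, which of the two (1, 10) rows is returned would be unpredictable.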

Or use GROUP BY:

SELECT c.cluster_id
     , num_docs
     , feed_id
     , url_time
FROM   url_info u
JOIN   cluster_info c ON (c.cluster_id = u.cluster_id)
WHERE  feed_id IN (SELECT pot_seeder FROM potentials)
AND    num_docs > 5
AND    url_time > '2012-04-16'
GROUP  BY c.cluster_id, feed_id
ORDER  BY num_docs DESC;

If c.cluster_id, feed_id are the primary key columns of all tables (both, in this case) from which you include columns in the SELECT list, then this works as written in PostgreSQL 9.1 or later. Otherwise you need to GROUP BY the rest of the columns, aggregate them, or provide more information.
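
As a sketch of the aggregate variant, assuming you want the largest num_docs and the earliest url_time per group (that choice, and the aliases max_num_docs and first_url_time, are assumptions and not from the question):

```sql
SELECT c.cluster_id
     , feed_id
     , max(num_docs) AS max_num_docs     -- assumption: keep the largest count
     , min(url_time) AS first_url_time   -- assumption: keep the earliest time
FROM   url_info u
JOIN   cluster_info c ON (c.cluster_id = u.cluster_id)
WHERE  feed_id IN (SELECT pot_seeder FROM potentials)
AND    num_docs > 5
AND    url_time > '2012-04-16'
GROUP  BY c.cluster_id, feed_id
ORDER  BY max_num_docs DESC;
```

Because every non-grouped column is wrapped in an aggregate, this form does not depend on the primary key rule and should also run on versions older than 9.1.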