
Get the top 10 products for each category

There may be reasons to avoid analytic functions, but here is a solution that uses analytic functions alone:
select am, rf, rfm, rownum_rf2, rownum_rfm
from (
    -- the 3rd level takes the subproduct ranks, and for each equally ranked
    -- subproduct, it produces the product ranking within the category
    select am, rf, rfm, rownum_rfm,
      row_number() over (partition by am, rownum_rfm order by rownum_rf) rownum_rf2
    from (
        -- the 2nd level ranks (without ties) the products within
        -- categories, and subproducts within products simultaneously
        select am, rf, rfm,
          row_number() over (partition by am order by count_rf desc) rownum_rf,
          row_number() over (partition by am, rf order by count_rfm desc) rownum_rfm
        from (
            -- innermost query counts the records by subproduct
            -- using a regular group by. at the same time, it uses
            -- the analytic sum() over to get the counts by product
            -- (the category column am is assumed to belong to tg)
            select tg.am, ttc.rf, ttc.rfm,
              count(*) count_rfm,
              sum(count(*)) over (partition by tg.am, ttc.rf) count_rf
            from tg inner join ttc on tg.value = ttc.value
            group by tg.am, ttc.rf, ttc.rfm
        ) X
    ) Y
    -- at level 3, we drop all but the top 5 subproducts per product
    where rownum_rfm <= 5   -- top  5 subproducts
) Z
-- the filter on the final query retains only the top 10 products
where rownum_rf2 <= 10  -- top 10 products
order by am, rownum_rf2, rownum_rfm;

I used row_number() instead of rank() so that you never get ties; in other words, ties are broken arbitrarily. This query also does not work if the data is not dense enough: if any of the top 10 products has fewer than 5 subproducts, the result may show subproducts of some other products instead. But if the data is dense (a large, established database), the query should work fine.
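One possible way around the density caveat, offered as a sketch and not as part of the original answer, is to rank products with dense_rank() instead of deriving the product rank from subproduct ranks. Every subproduct row of a product carries the same count_rf, so ordering by count_rf and then by rf gives all rows of one product the same rank and gives different products different ranks. The column name rank_rf is made up for this example, and am is assumed to be a column of tg.

```sql
select am, rf, rfm, count_rf, count_rfm, rank_rf, rownum_rfm
from (
    select am, rf, rfm, count_rf, count_rfm,
      -- product rank within category; rf breaks ties between products
      -- that happen to have the same count_rf
      dense_rank() over (partition by am order by count_rf desc, rf) rank_rf,
      -- subproduct rank within product; rfm breaks ties deterministically
      row_number() over (partition by am, rf order by count_rfm desc, rfm) rownum_rfm
    from (
        -- counts by subproduct (group by) and by product (analytic sum)
        select tg.am, ttc.rf, ttc.rfm,
          count(*) count_rfm,
          sum(count(*)) over (partition by tg.am, ttc.rf) count_rf
        from tg inner join ttc on tg.value = ttc.value
        group by tg.am, ttc.rf, ttc.rfm
    ) X
) Y
where rank_rf <= 10     -- top 10 products per category
  and rownum_rfm <= 5   -- top  5 subproducts per product
order by am, rank_rf, rownum_rfm;
```

Because the product rank no longer depends on how many subproducts a product has, a top-10 product with only 3 subproducts simply returns 3 rows.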

The query below makes two passes over the data, but returns correct results in every case. Again, it is a ranking query without ties.
select am, rf, rfm, count_rf, count_rfm, rownum_rf, rownum_rfm
from (
    -- next join the top 10 products to the data again to get
    -- the subproduct counts
    select tg.am, tg.rf, ttc.rfm, tg.count_rf, tg.rownum_rf, count(*) count_rfm,
        -- an analytic order by takes an expression, not a column position,
        -- so it must name count(*) rather than use "order by 1"
        ROW_NUMBER() over (partition by tg.am, tg.rf order by count(*) desc) rownum_rfm
    from (
        -- first rank all the products within each category
        -- (the category column am is assumed to belong to tg)
        select tg.am, tg.value, ttc.rf, count(*) count_rf,
            ROW_NUMBER() over (partition by tg.am order by count(*) desc) rownum_rf
        from tg
        inner join ttc on tg.value = ttc.value
        group by tg.am, tg.value, ttc.rf
        order by count_rf desc
        ) tg
    inner join ttc on tg.value = ttc.value and tg.rf = ttc.rf
    -- filter the inner query for the top 10 products only
    where tg.rownum_rf <= 10
    group by tg.am, tg.rf, ttc.rfm, tg.count_rf, tg.rownum_rf
) X
-- filter where the subproduct rank is in top 5
where rownum_rfm <= 5
order by am, rownum_rf, rownum_rfm;

count_rf : count of sales by product
count_rfm : count of sales by subproduct
rownum_rf : product rank within category (rownumber - without ties)
rownum_rfm : subproduct rank within product (without ties)
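
To illustrate what "without ties" means in the legend above, here is a small self-contained sketch comparing row_number(), rank() and dense_rank() on made-up data. The sales rows and the column names cnt, rn, rnk and drnk are invented for illustration. It also shows that adding a second sort key (here rf) makes the row_number() tie-break deterministic instead of arbitrary.

```sql
-- hypothetical data: products A and B are tied on their sales count
with sales (rf, cnt) as (
    select 'A', 10 from dual union all
    select 'B', 10 from dual union all
    select 'C',  7 from dual
)
select rf, cnt,
  row_number() over (order by cnt desc, rf) rn,   -- always unique: 1, 2, 3
  rank()       over (order by cnt desc)     rnk,  -- ties share a rank, then a gap
  dense_rank() over (order by cnt desc)     drnk  -- ties share a rank, no gap
from sales
order by rn;

-- RF  CNT  RN  RNK  DRNK
-- A    10   1    1     1
-- B    10   2    1     1
-- C     7   3    3     2
```

With rank() a filter such as rnk <= 1 would return both A and B, while the row_number() filter rn <= 1 returns exactly one row.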