Oracle
 sql >> Base de Dados >  >> RDS >> Oracle

contar o número de linhas que ocorrem para cada data no intervalo de datas da coluna

WITH    q AS
        (
        SELECT  (
                SELECT  MIN(start_date)
                FROM    mytable
                ) + level - 1 AS mydate
        FROM    dual
        CONNECT BY
                level <= (
                SELECT  MAX(end_date) - MIN(start_date)
                FROM    mytable
                )
        )
SELECT  group, mydate,
        (
        SELECT  COUNT(*)
        FROM    mytable mi
        WHERE   mi.group = mo.group
                AND q BETWEEN mi.start_date AND mi.end_date
        ) 
FROM    q
CROSS JOIN
        (
        SELECT  DISTINCT group
        FROM    mytable
        ) mo

Atualização:

Uma consulta melhor e mais rápida utilizando funções analíticas.

A ideia principal é que o número de intervalos contendo cada data é a diferença antes da contagem de intervalos iniciados antes dessa data e a contagem de intervalos que terminaram antes dela.
SELECT  cur_date,
        grouper,
        SUM(COALESCE(scnt, 0) - COALESCE(ecnt, 0)) OVER (PARTITION BY grouper ORDER BY cur_date) AS ranges
FROM    (
        SELECT  (
                SELECT  MIN(start_date)
                FROM    t_range
                ) + level - 1 AS cur_date
        FROM    dual
        CONNECT BY
                level <=
                (
                SELECT  MAX(end_date)
                FROM    t_range
                ) -
                (
                SELECT  MIN(start_date)
                FROM    t_range
                ) + 1
        ) dates
CROSS JOIN
        (
        SELECT  DISTINCT grouper AS grouper
        FROM    t_range
        ) groups
LEFT JOIN
        (
        SELECT  grouper AS sgrp, start_date, COUNT(*) AS scnt
        FROM    t_range
        GROUP BY
                grouper, start_date
        ) starts
ON      sgrp = grouper
        AND start_date = cur_date
LEFT JOIN
        (
        SELECT  grouper AS egrp, end_date, COUNT(*) AS ecnt
        FROM    t_range
        GROUP BY
                grouper, end_date
        ) ends
ON      egrp = grouper
        AND end_date = cur_date - 1
ORDER BY
        grouper, cur_date

Esta consulta é concluída em 1 segundo em 1,000,000 linhas.

Veja esta entrada no meu blog para mais detalhes: