前回では、
コンバージョンパスに着目した応用KPI分析
まず、
SELECT
COUNT(1) AS cnt,
AVG(path_length) AS avg_path_length,
AVG(last_time-landing_time) AS avg_conversion_time
FROM (
SELECT
cookie, cv_id,
MAX(node_id) AS path_length,
MIN(time) AS landing_time,
MAX(time) AS cv_time
FROM conversion_path
GROUP BY cookie, cv_id
) t1
JOIN (
SELECT
cookie,
cv_id,
MAX(path) as last_node,
MAX(time) as last_time
FROM conversion_path
WHERE 1 < node_id AND cv_flag = 0
GROUP BY 1,2
) t2 ON (t1.cookie=t2.cookie AND t1.cv_id=t2.cv_id)
結果例
2015年1月には5053回のコンバージョン
※ これらの値はサンプルデータに基づくもので、
cnt | avg_ | avg_ |
---|---|---|
5053 | 1047. | 66692. |
コンバージョンパス長の分布
先ほどは
SELECT
path_length,
COUNT(1) AS cnt
FROM (
SELECT
cookie, cv_id,
MAX(node_id) AS path_length
FROM conversion_path
GROUP BY cookie, cv_id
) t1
GROUP BY path_length
result = _
result.DataFrame().plot(x='path_length', xlim=(0,200), ylim=(0,200), figsize=(12, 10))
可視化してみると、

コンバージョン時間の分布
同様にコンバージョン時間の分布も見てみましょう。
SELECT CEIL(cv_time/width)*width AS x, COUNT(1) AS cnt
FROM
(
SELECT (last_time-landing_time)/3600 AS cv_time
FROM (
SELECT
cookie, cv_id,
MAX(node_id) AS path_length,
MIN(time) AS landing_time,
MAX(time) AS cv_time
FROM conversion_path
GROUP BY cookie, cv_id
) temp1
JOIN (
SELECT
cookie,
cv_id,
MAX(path) as last_node,
MAX(time) as last_time
FROM conversion_path
WHERE 1 < node_id AND cv_flag = 0
GROUP BY 1,2
) temp2 ON (temp1.cookie=temp2.cookie AND temp1.cv_id=temp2.cv_id)
) t1,
(
SELECT POW(10,floor(LOG10(MAX( (last_time-landing_time)/3600 )))-1) AS width
FROM (
SELECT
cookie, cv_id,
MAX(node_id) AS path_length,
MIN(time) AS landing_time,
MAX(time) AS cv_time
FROM conversion_path
GROUP BY cookie, cv_id
) temp3
JOIN (
SELECT
cookie,
cv_id,
MAX(path) as last_node,
MAX(time) as last_time
FROM conversion_path
WHERE 1 < node_id AND cv_flag = 0
GROUP BY 1,2
) temp4 ON (temp3.cookie=temp4.cookie AND temp3.cv_id=temp4.cv_id)
) t2
GROUP BY ceil(cv_time/width)*width
ORDER BY x
result = _
result.DataFrame().plot(x='x', figsize=(12, 10))
コンバージョン時間もパス長と同じように、

コンバージョン率(全体)
非コンバージョンを保持していましたので、
SELECT cv_cnt, uncv_cnt, cv_cnt*1.0/(cv_cnt+uncv_cnt) AS cv_ratio
FROM
(
SELECT
COUNT(1) as cv_cnt
FROM (
SELECT
cookie,
cv_id
FROM conversion_path
GROUP BY 1,2
) t1
) t2 JOIN (
SELECT
COUNT(1) as uncv_cnt
FROM (
SELECT cookie
FROM non_conversion_path
GROUP BY 1
) t3
) t4 ON 1=1
コンバージョン率は9%と、
cv_ | uncv_ | cv_ |
---|---|---|
5359 | 46500 | 0. |
パス類型
さて今まで全体のコンバージョンパスを見ていましたが、
スルーコンバージョン
自社のコンバージョンにつながりそうなページを取り上げて、
SELECT cv_cnt, uncv_cnt, cv_cnt*1.0/(cv_cnt+uncv_cnt) AS cv_ratio
FROM
(
SELECT
COUNT(1) as cv_cnt
FROM
(
SELECT
cookie, cv_id
FROM conversion_path
WHERE path = '/treasuredata.com/jp/case-studies'
GROUP BY 1,2) t1
) cv,
(
SELECT COUNT(1) AS uncv_cnt FROM (
SELECT cookie
FROM non_conversion_path
WHERE path = '/treasuredata.com/jp/case-studies'
GROUP BY 1 ) t2
) uncv
こうしてみると、
cv_ | uncv_ | cv_ |
---|---|---|
3338 | 303 | 0. |
ランディングコンバージョン
ランディングページは、
SELECT cv_cnt, uncv_cnt, cv_cnt*1.0/(cv_cnt+uncv_cnt) AS cv_ratio
FROM
(
SELECT
COUNT(1) as cv_cnt
FROM
(
SELECT
cookie, cv_id, time,
MIN(path) AS landing_path
FROM conversion_path
GROUP BY cookie, cv_id, time
) t1
WHERE landing_path = '/treasuredata.com/jp/case-studies'
) cv,
(
SELECT COUNT(1) AS uncv_cnt
FROM (
SELECT
cookie, time,
MIN(path) AS landing_path
FROM non_conversion_path
GROUP BY cookie, time
) t2
WHERE landing_path = '/treasuredata.com/jp/case-studies'
) uncv
cv_ | uncv_ | cv_ |
---|---|---|
176 | 0 | 1. |
この結果は非常に有意義で、
直前コンバージョン
コンバージョンの直前のページは、
SELECT
last_node,
COUNT(1) AS cnt
FROM (
SELECT
cookie,
cv_id,
MAX(path) as last_node
FROM conversion_path
WHERE 1 < node_id AND cv_flag = 0
GROUP BY 1,2
) t1
GROUP BY 1
result = _
r = result.DataFrame().set_index("last_node")
r.plot(kind='pie', y='cnt', figsize=(15, 15))
last_ | cnt |
---|---|
/treasuredata. | 3379 |
/treasuredata. | 1405 |
/treasuredata. | 98 |
/treasuredata. | 34 |
/treasuredata. | 33 |

基本/応用KPI分析まとめ
第6回から3回に渡って、
さらに、
次回は、