๐Ÿ–ฅ๏ธ ์ฝ”๋”ฉํ…Œ์ŠคํŠธ/SQL

[DataLemur] [Twitter] Histogram of Tweets

rtw0202 2026. 3. 11. 21:56

1. ๋ฌธ์ œ ์„ค๋ช…

ํŠธ์œ„ํ„ฐ ํŠธ์œ— ๋ฐ์ดํ„ฐ ํ…Œ์ด๋ธ”์ด ์ฃผ์–ด์กŒ๋‹ค๊ณ  ๊ฐ€์ •ํ•˜๊ณ , 2022๋…„ ์‚ฌ์šฉ์ž๋ณ„ ํŠธ์œ— ๊ฒŒ์‹œ ํšŸ์ˆ˜ ํžˆ์Šคํ† ๊ทธ๋žจ์„ ์ƒ์„ฑํ•˜๋Š” ์ฟผ๋ฆฌ๋ฅผ ์ž‘์„ฑํ•˜์„ธ์š”. ๊ฐ ์‚ฌ์šฉ์ž๋ณ„ ํŠธ์œ— ๊ฒŒ์‹œ ํšŸ์ˆ˜๋ฅผ ๊ทธ๋ฃน์œผ๋กœ ๋ฌถ๊ณ , ํ•ด๋‹น ๊ทธ๋ฃน์— ์†ํ•œ ํŠธ์œ„ํ„ฐ ์‚ฌ์šฉ์ž ์ˆ˜๋ฅผ ์ถœ๋ ฅํ•˜์„ธ์š”. ์ฆ‰, 2022๋…„ ํŠธ์œ— ๊ฒŒ์‹œ ํšŸ์ˆ˜๋ฅผ ๊ธฐ์ค€์œผ๋กœ ์‚ฌ์šฉ์ž๋ฅผ ๊ทธ๋ฃนํ™”ํ•˜๊ณ  ๊ฐ ๊ทธ๋ฃน์— ์†ํ•œ ์‚ฌ์šฉ์ž ์ˆ˜๋ฅผ ๊ณ„์‚ฐํ•˜์„ธ์š”.

tweets ํ…Œ์ด๋ธ”์€ ๋‹ค์Œ๊ณผ ๊ฐ™์Šต๋‹ˆ๋‹ค.

 

2. ์ถœ๋ ฅ ์˜ˆ์‹œ

 
3. ๋ฌธ์ œ ๋‹ต์•ˆ

SELECT 
  t.tweet_bucket,
  COUNT(DISTINCT t.user_id) AS users_num
FROM (SELECT
        user_id,
        COUNT(DISTINCT tweet_id) AS tweet_bucket
      FROM tweets
      WHERE YEAR(tweet_date) = '2022'
      GROUP BY user_id
      ) t
GROUP BY t.tweet_bucket
ORDER BY t.tweet_bucket;