Podcast Reviews Example

This is an example data application that produces text insights from podcast review data. It is made up of six datasets:

Raw reviews (date, podcast, text, rating)
Podcasts (podcast, title, category)
Categorized review text (date, category, podcast, text)
Phrase models (date, category, hash, ngram, score)
Podcast phrase stats (date, category, podcast, ngram, count, rating)
Podcast daily summary (date, category, podcast, phrase_stats, recent_reviews)

flowchart LR
    raw_reviews[(Raw Reviews)] & podcasts[(Podcasts)] --> categorize_text --> categorized_texts[(Categorized Texts)]
    categorized_texts --> phrase[Phrase Modeling] --> phrase_models[(Phrase Models)]
    phrase_models & raw_reviews --> phrase_stats --> podcast_phrase_stats[(Podcast Phrase Stats)]
    podcast_phrase_stats & raw_reviews --> calc_summary --> podcast_daily_summary[(Podcast Daily Summary)]
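
The flowchart above can also be read as a dependency graph. As a minimal sketch (not part of the example application itself), the following Python snippet encodes the same edges and derives one valid execution order with the standard library's `graphlib`. The node names are copied from the flowchart; the `DEPENDENCIES` dictionary and the `execution_order` helper are hypothetical names introduced only for illustration.

```python
from graphlib import TopologicalSorter

# Each key is a step or dataset; its value is the set of nodes it reads from.
# Edges are copied from the flowchart; the dictionary itself is illustrative.
DEPENDENCIES = {
    "categorize_text": {"raw_reviews", "podcasts"},
    "categorized_texts": {"categorize_text"},
    "phrase": {"categorized_texts"},
    "phrase_models": {"phrase"},
    "phrase_stats": {"phrase_models", "raw_reviews"},
    "podcast_phrase_stats": {"phrase_stats"},
    "calc_summary": {"podcast_phrase_stats", "raw_reviews"},
    "podcast_daily_summary": {"calc_summary"},
}


def execution_order(deps=DEPENDENCIES):
    """Return one valid order in which every node comes after its inputs."""
    return list(TopologicalSorter(deps).static_order())


if __name__ == "__main__":
    for node in execution_order():
        print(node)
```

Because every node feeds, directly or indirectly, into the daily summary, any valid order ends with `podcast_daily_summary`.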

Input Data

Get the input data from here, and save it as examples/podcast_reviews/data/ingest/database.sqlite.
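
To confirm the download landed in the right place, a quick check with Python's built-in `sqlite3` module may help. This is only a sketch: `list_tables` is a hypothetical helper, and the table names it prints depend on the downloaded file, so none are assumed here.

```python
import sqlite3
from pathlib import Path

# Path taken from the instructions above.
DB_PATH = Path("examples/podcast_reviews/data/ingest/database.sqlite")


def list_tables(db_path):
    """Return the names of all tables in the SQLite file, sorted."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


# sqlite3.connect() creates an empty file when the path is missing,
# so only inspect the database if it is already there.
if DB_PATH.exists():
    print(list_tables(DB_PATH))
```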

`phrase` Dependency

This example relies on soaxelbrooke/phrase for phrase extraction. Check its releases for a binary that matches your platform.
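
Before running the pipeline, it can be useful to verify that the binary is reachable. The sketch below assumes the downloaded executable is named `phrase` and has been placed on your `PATH`; both are assumptions, not something this README specifies. `find_binary` is a hypothetical helper.

```python
import shutil


def find_binary(name="phrase"):
    """Return the full path of `name` if it is on PATH, otherwise None."""
    # The default name "phrase" is an assumption about how the binary is installed.
    return shutil.which(name)


if __name__ == "__main__":
    location = find_binary()
    if location is None:
        print("phrase binary not found on PATH")
    else:
        print(f"phrase binary found at {location}")
```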
