Per-role recommendations for leveraging the warehouse_transformations dbt project — mapped to real job needs.
The heartbeat of the business. Aggregated sales by hour, ticket type, customer type, and revenue source. Shows near-real-time revenue across all stores and compares recreational vs. medical performance.
Master client list with total_transactions, last_transaction_date, and days_since_last_transaction. Instant view of which clients are active, growing, or at risk of churning.
Daily sales by store with the full cost stack: base cost, true cost, excise. Clean view of gross margin across the entire portfolio — essential for understanding platform economics.
End-consumer demographics across the Treez base: age groups, zip codes, gender, visit types, new vs. returning. Powerful for investor decks and market sizing conversations.
Data health per client. If stores stop streaming, that's a leading churn or technical-risk indicator. time_since_last_stream_hours is an operational health canary.
Build a weekly executive dashboard pulling from ra_hourly_sales + tenant_selltreez + store_daily_sales showing: total platform GMV, active store count trending, average revenue per store, and stores going dark (high days_since_last_transaction). Cross-reference against the master client sheet (ARPS + CSM assignments) to make churn risk actionable — every dark store should have a CSM on it within 48h.
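A minimal sketch of the dashboard's core rollup, assuming hypothetical column names such as sale_date and total_net_sales on store_daily_sales (only days_since_last_transaction and store_id are confirmed above); treat it as a starting point rather than the models' actual contract:

```sql
-- Sketch: weekly executive rollup. sale_date and total_net_sales are assumed
-- column names on store_daily_sales; the 7-day "dark store" cutoff is a placeholder.
WITH weekly_revenue AS (
    SELECT
        DATE_TRUNC('week', sale_date)                              AS week_start,
        COUNT(DISTINCT store_id)                                   AS active_stores,
        SUM(total_net_sales)                                       AS platform_gmv,
        SUM(total_net_sales) / NULLIF(COUNT(DISTINCT store_id), 0) AS revenue_per_store
    FROM store_daily_sales
    WHERE sale_date >= DATEADD('week', -12, CURRENT_DATE)
    GROUP BY 1
),
dark_stores AS (
    SELECT store_id, days_since_last_transaction
    FROM tenant_selltreez
    WHERE days_since_last_transaction > 7
)
SELECT
    w.*,
    (SELECT COUNT(*) FROM dark_stores) AS dark_store_count
FROM weekly_revenue w
ORDER BY week_start;
```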
Which stores have catalog_enabled and epm_enabled (Enterprise Product on Menu). The fastest way to measure feature adoption rates for catalog functionality across the entire client base.
Raw product data with all attributes: images, descriptions, barcodes, classifications, sizes. Measure product data quality (what share of products have images, descriptions, and barcodes populated) to prioritize catalog improvement work.
Every configuration flag per store. Joey can map feature adoption across the entire base and find underutilized capabilities that could be promoted or simplified.
Timestamps of every customer check-in event. Essential for understanding queue management, lobby flow, and patient experience if those features are on the roadmap.
Distribution of log_action types (Invoice Received, Count Adjustment, Loss, Waste, Transfer, etc.) reveals which inventory workflows are actually used vs. ignored by clients.
Cross-reference tenant_catalog feature flags with base_product_api_public_products data quality (% of products with images, descriptions, barcodes populated) to build a Catalog Health Score by store. Stores with catalog enabled but low data quality = product adoption friction. Pair this with Pendo page-view data on the catalog editor to see if low-quality stores also have low engagement — that's your highest-priority UX improvement target.
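A sketch of how the Catalog Health Score could be computed, assuming store_id joins the two models and that image_url, description, and barcode are the attribute column names (all assumptions to verify against the actual schema):

```sql
-- Sketch: Catalog Health Score per store. image_url, description, barcode, and
-- the store_id join key are assumed names; the equal weighting is a starting point.
SELECT
    p.store_id,
    COUNT(*)                                        AS total_products,
    AVG(IFF(p.image_url   IS NOT NULL, 1, 0))       AS pct_with_image,
    AVG(IFF(p.description IS NOT NULL, 1, 0))       AS pct_with_description,
    AVG(IFF(p.barcode     IS NOT NULL, 1, 0))       AS pct_with_barcode,
    ROUND((AVG(IFF(p.image_url   IS NOT NULL, 1, 0))
         + AVG(IFF(p.description IS NOT NULL, 1, 0))
         + AVG(IFF(p.barcode     IS NOT NULL, 1, 0))) / 3, 3) AS catalog_health_score
FROM base_product_api_public_products p
JOIN tenant_catalog c USING (store_id)
WHERE c.catalog_enabled = TRUE
GROUP BY 1
ORDER BY catalog_health_score;
```

Stores at the bottom of this ranking, cross-referenced with low Pendo engagement on the catalog editor, are the adoption-friction candidates described above.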
Per-store, per-table ingestion stats: last_sync, last_stream, time_since_last_stream_hours, and record counts. The single best place to build automated alerting for stale data.
Store-level rollup of snapshot/stream health. time_since_last_snapshot_hours and time_since_last_stream_hours are Jeremy's SLA monitors — a spike here means a Debezium connector is down.
Connector metadata: types, purposes, linked datasources. Jeremy can see which connectors exist, track new additions, and spot connectors that have been deleted unexpectedly.
The most complex incremental model: date_series CROSS JOIN batch_fact, then filtered. Full refreshes are extremely expensive. Monitor last_sync drift between this model and its upstream batch_fact to detect pipeline bottlenecks.
Manual overrides for inventory audit end dates. Jeremy should understand when and why these corrections exist — they indicate known edge cases in the source data or ingestion logic.
Build an ops monitoring dashboard on tenant_ingestion_table showing: stores with streams stale >6h, tables with zero records, snapshot duration anomalies, and new stores missing their initial snapshot. The critical bottleneck to watch: base_inventory_tz_inventory_aud (the largest upstream table feeding batch logic) — if its last_stream falls behind, the entire inventory pipeline degrades. Also track the incremental last_sync lag: if batch_daily_sales_and_inventory is more than 2h behind batch_fact, something's queued up.
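A minimal alerting query against tenant_ingestion_table, assuming table_name and record_count as the per-table columns (time_since_last_stream_hours, last_stream, and last_sync are named above); the thresholds are placeholders:

```sql
-- Stale ingestion: streams older than 6h or tables with no records.
-- table_name and record_count are assumed column names on tenant_ingestion_table.
SELECT
    store_id,
    table_name,
    last_stream,
    time_since_last_stream_hours,
    record_count
FROM tenant_ingestion_table
WHERE time_since_last_stream_hours > 6
   OR record_count = 0
ORDER BY time_since_last_stream_hours DESC;

-- Incremental lag: how far batch_daily_sales_and_inventory trails its upstream batch_fact.
SELECT DATEDIFF(
    'hour',
    (SELECT MAX(last_sync) FROM batch_daily_sales_and_inventory),
    (SELECT MAX(last_sync) FROM batch_fact)
) AS sync_lag_hours;
```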
The most complex model in the project. Full outer joins across three CTEs (product_net_sales, product_monthly_inventory, monthly_activity) plus the gram_formatter macro for unit conversion. This is where reconciliation math lives — and where silent bugs hide.
A date_series CROSS JOIN against batch_fact, joined to batch_last_log_by_date and batch_daily_sales. The incremental logic re-processes recent syncs AND active batches missing today's record. It is the most expensive model in the DAG; its clustering on (org_id, store_id) should be validated against actual query patterns.
Reconciliation model that checks whether inventory activities + packaging history + sales net to zero. Four match flags: all_match, ticket_sales_match, packaging_match, packaging_sales_match. Accounting-grade data quality logic.
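For illustration only, the shape such match flags usually take; the quantity columns and combined input CTE here are hypothetical, and the pairwise flag definitions are guesses at the semantics rather than the model's actual logic:

```sql
-- Illustration only: hypothetical quantity columns and input CTE showing the
-- general shape of net-to-zero match flags; the real model's definitions differ.
SELECT
    batch_id,
    (activity_qty + packaging_qty + sales_qty = 0) AS all_match,
    (activity_qty + sales_qty = 0)                 AS ticket_sales_match,
    (activity_qty + packaging_qty = 0)             AS packaging_match,
    (packaging_qty + sales_qty = 0)                AS packaging_sales_match
FROM reconciliation_inputs;  -- hypothetical combined CTE/table
```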
Inventory log reconstruction with 15+ CASE WHEN blocks mapping raw Debezium actions to human-readable log entries with before/after quantities. Business logic is intricate — the inventory_conversion_handler join for LOSS actions is especially subtle.
Handles the LOSS → Convert From edge case in inventory audit records by looking back at previous revision quantities. The self-join on base_inventory_tz_inventory_aud using revend = rev is the key pattern to understand.
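A stripped-down sketch of that look-back pattern; the inventory_id, quantity, and action columns are assumptions, and only the revend = rev join condition is taken from the description above:

```sql
-- Sketch: look back at the revision a LOSS row superseded by joining the
-- previous row's revend to the current row's rev. inventory_id, quantity,
-- and action are assumed column names.
SELECT
    cur.inventory_id,
    cur.rev        AS current_rev,
    prev.rev       AS previous_rev,
    prev.quantity  AS quantity_before,
    cur.quantity   AS quantity_after
FROM base_inventory_tz_inventory_aud cur
JOIN base_inventory_tz_inventory_aud prev
  ON  prev.inventory_id = cur.inventory_id
  AND prev.revend       = cur.rev
WHERE cur.action = 'LOSS';
```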
Four focus areas: (1) Incremental cost optimization: the batch_all_dates CTE in batch_daily_sales_and_inventory uses a date_series CROSS JOIN (4K dates × millions of batches) even on incremental runs; the existing filter logic is good but worth benchmarking. (2) Commented-out delete filters: several inventory models carry a commented-out --AND ia.deleted = FALSE condition; understanding when and why those filters were disabled is critical to data correctness. (3) The incremental_date macro uses correlated subqueries per org/store that could be expensive at scale and is worth profiling. (4) Clustering key validation: nearly every model clusters on (org_id, store_id), but many queries filter only on store_id; check Snowflake query profiles to confirm pruning is effective, as sketched below.
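For focus area (4), a quick way to gauge clustering and pruning without reading every query profile by hand, using standard Snowflake functions and account usage views (the seven-day window and result limit are arbitrary):

```sql
-- How well-clustered is batch_fact on its declared keys vs. store_id alone?
SELECT SYSTEM$CLUSTERING_INFORMATION('batch_fact', '(org_id, store_id)');
SELECT SYSTEM$CLUSTERING_INFORMATION('batch_fact', '(store_id)');

-- Pruning effectiveness for recent queries touching the expensive model
-- (ACCOUNT_USAGE views lag by up to ~45 minutes).
SELECT
    query_text,
    partitions_scanned,
    partitions_total,
    partitions_scanned / NULLIF(partitions_total, 0) AS scan_ratio
FROM snowflake.account_usage.query_history
WHERE query_text ILIKE '%batch_daily_sales_and_inventory%'
  AND start_time >= DATEADD('day', -7, CURRENT_TIMESTAMP)
ORDER BY scan_ratio DESC
LIMIT 50;
```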
The primary marketing model. Customer-level daily data with age group, gender, zip code, referral source, customer visit type (new vs. returning), ticket type, and revenue source. Build consumer personas and measure acquisition channel performance.
Revenue by hour with REVENUE_SOURCE and REVENUE_SOURCE_MIN_DATE. Shows how ecommerce vs. in-store revenue is trending across the client base — essential for marketing the Treez ecom product.
Brand performance data: which brands sell best, which have the highest employee adoption. Useful for brand partnership marketing, content around top-performing brands, and retail insights reports.
When each customer made their first Treez-powered purchase. Enables cohort analyses, measuring customer lifetime progression, and calculating time-to-first-purchase from signup date.
Every discount applied, with title, type, method, and dollar amount. Analyze which promotions drive the most revenue, which are overused, and build case studies around promotional effectiveness for clients.
Build a "Treez Retail Insights" content engine — aggregate across all 665 stores (anonymized/blinded) to produce publishable benchmarks: average cannabis consumer demographics by state, peak shopping hours, top-performing product categories, impact of delivery on revenue mix, average basket size trends. This becomes quarterly thought leadership content, PR-worthy data, and sales ammunition simultaneously. The ra_customer_demographic_sales model has everything needed. Cross-reference with AlpineIQ campaign data for the full acquisition-to-retention funnel story.
Client health at a glance: total_transactions, days_since_last_transaction, store_state, and org structure. Combined with ARPS from the master sheet, this is the foundation of a client health score.
Daily revenue by store — rolling trends reveal whether clients are growing or declining. Clients with 3+ consecutive weeks of revenue decline warrant CSM intervention before they churn.
Full invoice lifecycle: created → accepted → paid, with payment status, distributor info, and line-level detail. Active invoice volume is a proxy for how healthy a client's business is — declining invoices = trouble.
AP health per client. Clients with large unpaid balances may be having cash flow issues (churn risk). Clients with rapidly growing AP volume are scaling — prime candidates for upsell conversations.
Payment method mix by store and day. Clients still on cash-only are upsell targets for TreezPay/integrated payments. Payment diversification also correlates with business maturity.
Build a Client Health Score combining four signals: (1) Transaction recency from tenant_selltreez.days_since_last_transaction, (2) Revenue trend — rolling 90-day slope from store_daily_sales, (3) Payment diversity — count of distinct payment types from store_daily_payments, (4) Invoice activity — invoice count trend from invoice_current_balance. Weight these into a composite score and cross-reference with Intercom support ticket volume (high support + declining revenue = imminent churn). Deliver this as a weekly ranking to CSM leads.
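A simplified composite covering three of the four signals (the invoice-activity trend is omitted for brevity); the weights, thresholds, and the sale_date, total_net_sales, payment_type, and payment_date column names are all assumptions to tune and verify:

```sql
-- Sketch: composite health score from recency, revenue trend, and payment diversity.
-- Weights (0.4/0.3/0.3), thresholds, and non-confirmed column names are placeholders.
WITH recency AS (
    SELECT store_id,
           IFF(days_since_last_transaction <= 2, 1.0,
               IFF(days_since_last_transaction <= 7, 0.5, 0.0)) AS recency_score
    FROM tenant_selltreez
),
revenue_trend AS (  -- crude slope proxy: last 45 days vs. the 45 days before that
    SELECT store_id,
           SUM(IFF(sale_date >= DATEADD('day', -45, CURRENT_DATE), total_net_sales, 0))
         / NULLIF(SUM(IFF(sale_date <  DATEADD('day', -45, CURRENT_DATE), total_net_sales, 0)), 0)
             AS recent_vs_prior_ratio
    FROM store_daily_sales
    WHERE sale_date >= DATEADD('day', -90, CURRENT_DATE)
    GROUP BY 1
),
payment_diversity AS (
    SELECT store_id, COUNT(DISTINCT payment_type) AS payment_types
    FROM store_daily_payments
    WHERE payment_date >= DATEADD('day', -30, CURRENT_DATE)
    GROUP BY 1
)
SELECT
    r.store_id,
    ROUND(0.4 * r.recency_score
        + 0.3 * LEAST(COALESCE(t.recent_vs_prior_ratio, 0), 1)
        + 0.3 * LEAST(COALESCE(p.payment_types, 0) / 4.0, 1), 3) AS health_score
FROM recency r
LEFT JOIN revenue_trend t     USING (store_id)
LEFT JOIN payment_diversity p USING (store_id)
ORDER BY health_score;
```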
Platform-level stats sales can cite in calls: "Our retailers process X million transactions monthly across Y states, with peak hours at Z." Aggregated and anonymized — safe to reference.
Budtender performance metrics: tickets, average order value, transaction time, customer count per hour. Sales can demo this live: "Here's exactly how a manager sees their team's performance in Treez."
Product-level inventory + sales combined — units sold, units remaining, cost of inventory, average cost per unit, sellable vs. quarantine. This is the core inventory management value prop, demoable end-to-end.
In the DBT_EXTERNAL_SHARE schema — designed for client-facing data sharing. Sales should know these exist: they're what Treez can offer clients who want to pull their own data programmatically.
The live Budtender Performance dashboard view with hourly AND daily store comparisons. Demonstrating this in a sales call — showing an actual dispensary's employee performance data — is one of Treez's strongest differentiators.
Build a Sales Battle Card from aggregate platform stats: average transaction size, busiest shopping hours by state, top product categories, payment method adoption rates, average inventory turns. Source from ra_hourly_sales + store_daily_payments + employee_brand_sales. Additionally, identify your top 20 healthiest clients (from tenant_selltreez by transaction volume) and use their anonymized metrics as benchmarks during prospect conversations — "our top performers see X average order value and Y basket size."
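A sketch for the "busiest shopping hours by state" stat, assuming store_state, hour_of_day, net_sales, and sale_date columns on ra_hourly_sales (assumptions to verify):

```sql
-- Sketch: top 3 shopping hours per state over the last quarter.
-- store_state, hour_of_day, net_sales, and sale_date are assumed column names.
SELECT
    store_state,
    hour_of_day,
    SUM(net_sales) AS total_sales
FROM ra_hourly_sales
WHERE sale_date >= DATEADD('month', -3, CURRENT_DATE)
GROUP BY 1, 2
QUALIFY ROW_NUMBER() OVER (PARTITION BY store_state ORDER BY SUM(net_sales) DESC) <= 3
ORDER BY store_state, total_sales DESC;
```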
Distribution of log_action types (Invoice Received, Count Adjustment, Loss, Waste, Transfer, etc.) reveals which inventory workflows are actually in use. Low usage of certain actions = features clients don't understand or don't need.
How many discount types exist per store, what % are custom vs. automatic vs. rewards. The discount workflow is one of the most complex features — usage distribution tells Scott where to simplify or improve.
Every config flag per store. Scott can build a feature adoption matrix: which features are on, for how many stores, correlated with store revenue and state. This is product roadmap fuel.
Full invoice receiving workflow. What % of invoice lines have all cost fields populated vs. partial? Low completion rates indicate UX friction in the receiving flow — a specific, actionable improvement target.
Check-in timestamp data. Scott can measure the time delta between check-in and ticket close — proxy for wait time and transaction efficiency. Long waits = lobby or POS workflow problems worth solving.
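The check-in model is not named above, so the sketch below uses hypothetical customer_checkins and tickets relations purely to show the shape of the wait-time calculation; swap in the project's actual model and column names:

```sql
-- Hypothetical model/column names: customer_checkins(store_id, customer_id, checkin_time)
-- and tickets(store_id, customer_id, close_time). The 2-hour window keeps the join
-- scoped to the same visit.
SELECT
    c.store_id,
    DATE_TRUNC('day', c.checkin_time)                      AS visit_date,
    AVG(DATEDIFF('minute', c.checkin_time, t.close_time))  AS avg_minutes_checkin_to_close
FROM customer_checkins c
JOIN tickets t
  ON  t.store_id    = c.store_id
  AND t.customer_id = c.customer_id
  AND t.close_time BETWEEN c.checkin_time AND DATEADD('hour', 2, c.checkin_time)
GROUP BY 1, 2
ORDER BY 1, 2;
```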
Map the dbt feature surface area to Treez product features and measure adoption per feature using recent row counts: Inventory Management → batch_fact, batch_audit_log · Receiving/Invoicing → invoice_line, invoice_payments · POS/Sales → ticketline_sales, ticket_payments · Discounting → ticketline_discounts · Employee Mgmt → employee_hourly_sales_v2 · Cash Management → shift_tickets, cash_handling_audit_log. Models with low recent activity = features nobody's using. Cross-reference with Pendo page views on the corresponding UI surfaces to confirm — if Snowflake says low activity AND Pendo says low clicks, that feature needs rethinking.
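A sketch of the row-count adoption proxy for a few of those mappings, assuming each model carries a created_at timestamp and a store_id (assumptions; substitute the real event-time columns):

```sql
-- Sketch: 30-day activity per feature area. created_at and store_id are assumed
-- column names; extend the UNION for the remaining feature-to-model mappings.
WITH recent AS (
    SELECT 'Inventory Management'   AS feature,
           COUNT(*)                 AS rows_30d,
           COUNT(DISTINCT store_id) AS stores_30d
    FROM batch_audit_log
    WHERE created_at >= DATEADD('day', -30, CURRENT_DATE)
    UNION ALL
    SELECT 'Receiving/Invoicing', COUNT(*), COUNT(DISTINCT store_id)
    FROM invoice_line
    WHERE created_at >= DATEADD('day', -30, CURRENT_DATE)
    UNION ALL
    SELECT 'Discounting', COUNT(*), COUNT(DISTINCT store_id)
    FROM ticketline_discounts
    WHERE created_at >= DATEADD('day', -30, CURRENT_DATE)
)
SELECT * FROM recent ORDER BY stores_30d;
```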
| Person | Role | Top Model | Why It Matters |
|---|---|---|---|
| HM | CEO | ra_hourly_sales | Platform revenue pulse + active store health |
| Joey | Product | tenant_catalog + base_product_api_public_products | Feature adoption rates + catalog data quality |
| Jeremy | Head of Eng | tenant_ingestion_table | Debezium pipeline health + SLA alerting |
| Constantine | Database | store_product_monthly_inventory_changes | Most complex reconciliation — where bugs hide |
| Beth | Marketing | ra_customer_demographic_sales | Consumer personas + thought leadership content |
| Richard | Director | tenant_selltreez + store_daily_sales | Client health scoring + churn risk signals |
| Josh & Nick | Sales | ra_inventory_performance.v2 + employee_hourly_sales_v2 | Demo value + platform-scale proof points |
| Scott | Product Manager | batch_audit_log + ticketline_discounts | Feature usage mapping + UX improvement targets |