🌲 Treez Data Personas

Per-role recommendations for leveraging the warehouse_transformations dbt project — mapped to real job needs.

📦 Snowflake · RETAIL_ANALYTICS 🔧 dbt 1.11 · Silver/Gold layers 🏪 ~665 active stores 📅 Generated Feb 2026
About this document: Each persona describes what that person cares about day-to-day, the specific dbt models and Snowflake tables that are most impactful for their work, and a recommended "deep dive" angle — a strategic way to unlock value from the data that aligns with their job goals. Models are drawn from the RETAIL_ANALYTICS database across schemas:
DBT_tenantDBT_inventoryDBT_ticketDBT_invoiceDBT_employeeDBT_customerDBT_accountingDBT_retail_analyticsDBT_discountDBT_external_share
👔

HM

Chief Executive Officer
  • Platform revenue health
  • Client retention & churn signals
  • Market expansion
  • High-level P&L
  • Investor-ready metrics
ra_hourly_sales
27.5M rows · DBT_retail_analytics

The heartbeat of the business. Aggregated sales by hour, ticket type, customer type, and revenue source. Shows real-time-ish revenue across all stores and compares rec vs. medical performance.

tenant_selltreez
665 rows · DBT_tenant

Master client list with total_transactions, last_transaction_date, and days_since_last_transaction. Instant view of which clients are active, growing, or at risk of churning.

store_daily_sales
4.7M rows · DBT_accounting

Daily sales by store with the full cost stack: base cost, true cost, excise. Clean view of gross margin across the entire portfolio — essential for understanding platform economics.

ra_customer_demographic_sales
184M rows · DBT_retail_analytics

End-consumer demographics across the Treez base: age groups, zip codes, gender, visit types, new vs. returning. Powerful for investor decks and market sizing conversations.

tenant_ingestion
665 rows · DBT_tenant

Data health per client. If stores stop streaming, that's a leading churn or technical-risk indicator. time_since_last_stream_hours is an operational health canary.

Build a weekly executive dashboard pulling from ra_hourly_sales + tenant_selltreez + store_daily_sales showing: total platform GMV, active store count trending, average revenue per store, and stores going dark (high days_since_last_transaction). Cross-reference against the master client sheet (ARPS + CSM assignments) to make churn risk actionable — every dark store should have a CSM on it within 48h.

🛠️

Joey

Product
  • Feature adoption rates
  • Product usage patterns
  • Catalog & inventory UX
  • Data-informed roadmap
  • Client segment behavior
tenant_catalog
665 rows · DBT_tenant

Which stores have catalog_enabled and epm_enabled (Enterprise Product on Menu). The fastest way to measure feature adoption rates for catalog functionality across the entire client base.

base_product_api_public_products
10.5M rows · DBT_base

Raw product data with all attributes: images, descriptions, barcodes, classifications, sizes. Measure product data quality — how many products have images? descriptions? — to prioritize catalog improvement work.

base_dispensary_crm_configuration
DBT_base

Every configuration flag per store. Joey can map feature adoption across the entire base and find underutilized capabilities that could be promoted or simplified.

checkin_time
192M rows · DBT_visit

Timestamps of every customer check-in event. Essential for understanding queue management, lobby flow, and patient experience if those features are on the roadmap.

batch_audit_log
DBT_inventory

Distribution of log_action types (Invoice Received, Count Adjustment, Loss, Waste, Transfer, etc.) reveals which inventory workflows are actually used vs. ignored by clients.

Cross-reference tenant_catalog feature flags with base_product_api_public_products data quality (% of products with images, descriptions, barcodes populated) to build a Catalog Health Score by store. Stores with catalog enabled but low data quality = product adoption friction. Pair this with Pendo page-view data on the catalog editor to see if low-quality stores also have low engagement — that's your highest-priority UX improvement target.

⚙️

Jeremy

Head of Engineering
  • Pipeline reliability & SLAs
  • Data quality
  • Ingestion latency
  • Infrastructure scaling
  • Incident detection
tenant_ingestion_table
37K rows · DBT_tenant

Per-store, per-table ingestion stats: last_sync, last_stream, time_since_last_stream_hours, and record counts. The single best place to build automated alerting for stale data.

tenant_ingestion
665 rows · DBT_tenant

Store-level rollup of snapshot/stream health. time_since_last_snapshot_hours and time_since_last_stream_hours are Jeremy's SLA monitors — a spike here means a Debezium connector is down.

base_debezium_meta_debezium_ingestion_connectors
2.9K rows · DBT_base

Connector metadata: types, purposes, linked datasources. Jeremy can see which connectors exist, track new additions, and spot connectors that have been deleted unexpectedly.

batch_daily_sales_and_inventory
Incremental · DBT_inventory

The most complex incremental model: date_series CROSS JOIN batch_fact, then filtered. Full refreshes are extremely expensive. Monitor last_sync drift between this model and its upstream batch_fact to detect pipeline bottlenecks.

inv_aud_adj_end_dates (seed)
DBT_seeds

Manual overrides for inventory audit end dates. Jeremy should understand when and why these corrections exist — they indicate known edge cases in the source data or ingestion logic.

Build an ops monitoring dashboard on tenant_ingestion_table showing: stores with streams stale >6h, tables with zero records, snapshot duration anomalies, and new stores missing their initial snapshot. The critical bottleneck to watch: base_inventory_tz_inventory_aud (the largest upstream table feeding batch logic) — if its last_stream falls behind, the entire inventory pipeline degrades. Also track the incremental last_sync lag: if batch_daily_sales_and_inventory is more than 2h behind batch_fact, something's queued up.

🗄️

Constantine

Database / Data Engineering
  • Data modeling quality
  • Query performance & cost
  • Incremental logic correctness
  • Reconciliation accuracy
  • Schema design
store_product_monthly_inventory_changes
Table · DBT_inventory · AZ-specific

The most complex model in the project. Full outer joins across three CTEs (product_net_sales, product_monthly_inventory, monthly_activity) plus the gram_formatter macro for unit conversion. This is where reconciliation math lives — and where silent bugs hide.

batch_daily_sales_and_inventory
Incremental (delete+insert) · DBT_inventory

Date_series CROSS JOIN batch_fact, joined to batch_last_log_by_date and batch_daily_sales. The incremental logic re-processes recent syncs AND active batches missing today's record. Most expensive model in the DAG — clustering on org_id, store_id should be validated against actual query patterns.

packaging_history_handler
Table · DBT_inventory

Reconciliation model that checks whether inventory activities + packaging history + sales net to zero. Four match flags: all_match, ticket_sales_match, packaging_match, packaging_sales_match. Accounting-grade data quality logic.

batch_audit_log
Incremental (delete+insert) · DBT_inventory

Inventory log reconstruction with 15+ CASE WHEN blocks mapping raw Debezium actions to human-readable log entries with before/after quantities. Business logic is intricate — the inventory_conversion_handler join for LOSS actions is especially subtle.

inventory_conversion_handler
63.7K rows · DBT_inventory

Handles the LOSS → Convert From edge case in inventory audit records by looking back at previous revision quantities. The self-join on base_inventory_tz_inventory_aud using revend = rev is the key pattern to understand.

Four focus areas: (1) Incremental cost optimization — the batch_all_dates CTE in batch_daily_sales_and_inventory uses a date_series CROSS JOIN (4K dates × millions of batches) even on incremental runs. The existing filter logic is good but worth benchmarking. (2) Commented-out delete filters — several models have --AND ia.deleted = FALSE commented out throughout the inventory models; understanding when and why those were removed is critical to data correctness. (3) The incremental_date macro uses correlated subqueries per org/store that could be expensive at scale — worth profiling. (4) Clustering key validation — nearly every model clusters on (org_id, store_id), but many queries filter only on store_id; check Snowflake query profiles to confirm pruning is effective.

📣

Beth

Marketing
  • Customer acquisition & retention
  • Consumer demographics
  • Campaign effectiveness
  • Product & brand trends
  • Thought leadership content
ra_customer_demographic_sales
184M rows · DBT_retail_analytics

The primary marketing model. Customer-level daily data with age group, gender, zip code, referral source, customer visit type (new vs. returning), ticket type, and revenue source. Build consumer personas and measure acquisition channel performance.

ra_hourly_sales
27.5M rows · DBT_retail_analytics

Revenue by hour with REVENUE_SOURCE and REVENUE_SOURCE_MIN_DATE. Shows how ecommerce vs. in-store revenue is trending across the client base — essential for marketing the Treez ecom product.

employee_brand_sales
44.7M rows · DBT_employee

Brand performance data: which brands sell best, which have the highest employee adoption. Useful for brand partnership marketing, content around top-performing brands, and retail insights reports.

customer_first_tx_timestamp
25M rows · DBT_customer

When each customer made their first Treez-powered purchase. Enables cohort analyses, measuring customer lifetime progression, and calculating time-to-first-purchase from signup date.

ticketline_discounts
392M rows · DBT_discount

Every discount applied, with title, type, method, and dollar amount. Analyze which promotions drive the most revenue, which are overused, and build case studies around promotional effectiveness for clients.

Build a "Treez Retail Insights" content engine — aggregate across all 665 stores (anonymized/blinded) to produce publishable benchmarks: average cannabis consumer demographics by state, peak shopping hours, top-performing product categories, impact of delivery on revenue mix, average basket size trends. This becomes quarterly thought leadership content, PR-worthy data, and sales ammunition simultaneously. The ra_customer_demographic_sales model has everything needed. Cross-reference with AlpineIQ campaign data for the full acquisition-to-retention funnel story.

📊

Richard

Director
  • Client health & expansion
  • Revenue per client
  • Upsell opportunities
  • Team performance
  • Retention risk signals
tenant_selltreez
665 rows · DBT_tenant

Client health at a glance: total_transactions, days_since_last_transaction, store_state, and org structure. Combined with ARPS from the master sheet, this is the foundation of a client health score.

store_daily_sales
4.7M rows · DBT_accounting

Daily revenue by store — rolling trends reveal whether clients are growing or declining. Clients with 3+ consecutive weeks of revenue decline warrant CSM intervention before they churn.

ra_invoices.v1
View · DBT_retail_analytics

Full invoice lifecycle: created → accepted → paid, with payment status, distributor info, and line-level detail. Active invoice volume is a proxy for how healthy a client's business is — declining invoices = trouble.

invoice_current_balance
DBT_invoice

AP health per client. Clients with large unpaid balances may be having cash flow issues (churn risk). Clients with rapidly growing AP volume are scaling — prime candidates for upsell conversations.

store_daily_payments
3.4M rows · DBT_accounting

Payment method mix by store and day. Clients still on cash-only are upsell targets for TreezPay/integrated payments. Payment diversification also correlates with business maturity.

Build a Client Health Score combining four signals: (1) Transaction recency from tenant_selltreez.days_since_last_transaction, (2) Revenue trend — rolling 90-day slope from store_daily_sales, (3) Payment diversity — count of distinct payment types from store_daily_payments, (4) Invoice activity — invoice count trend from invoice_current_balance. Weight these into a composite score and cross-reference with Intercom support ticket volume (high support + declining revenue = imminent churn). Deliver this as a weekly ranking to CSM leads.

🤝

Josh & Nick

Sales
  • Compelling demo data
  • Platform scale proof points
  • ROI metrics to cite
  • Understanding successful client profiles
  • Competitive positioning
ra_hourly_sales
27.5M rows · DBT_retail_analytics

Platform-level stats sales can cite in calls: "Our retailers process X million transactions monthly across Y states, with peak hours at Z." Aggregated and anonymized — safe to reference.

employee_hourly_sales_v2
54.8M rows · DBT_employee

Budtender performance metrics: tickets, average order value, transaction time, customer count per hour. Sales can demo this live: "Here's exactly how a manager sees their team's performance in Treez."

ra_inventory_performance.v2
View · DBT_retail_analytics

Product-level inventory + sales combined — units sold, units remaining, cost of inventory, average cost per unit, sellable vs. quarantine. This is the core inventory management value prop, demoable end-to-end.

ext_ticket_payments.v1
231.9M rows · DBT_external_share

In the DBT_EXTERNAL_SHARE schema — designed for client-facing data sharing. Sales should know these exist: they're what Treez can offer clients who want to pull their own data programmatically.

ra_budperf_sales_metrics_v1
View · DBT_retail_analytics

The live Budtender Performance dashboard view with hourly AND daily store comparisons. Demonstrating this in a sales call — showing an actual dispensary's employee performance data — is one of Treez's strongest differentiators.

Build a Sales Battle Card from aggregate platform stats: average transaction size, busiest shopping hours by state, top product categories, payment method adoption rates, average inventory turns. Source from ra_hourly_sales + store_daily_payments + employee_brand_sales. Additionally, identify your top 20 healthiest clients (from tenant_selltreez by transaction volume) and use their anonymized metrics as benchmarks during prospect conversations — "our top performers see X average order value and Y basket size."

🗂️

Scott Thatcher

Product Manager
  • Feature usage metrics
  • User workflow mapping
  • Data-driven prioritization
  • Client segment behavior
  • Pain point identification
batch_audit_log
Incremental · DBT_inventory

Distribution of log_action types (Invoice Received, Count Adjustment, Loss, Waste, Transfer, etc.) reveals which inventory workflows are actually in use. Low usage of certain actions = features clients don't understand or don't need.

ticketline_discounts
392M rows · DBT_discount

How many discount types exist per store, what % are custom vs. automatic vs. rewards. The discount workflow is one of the most complex features — usage distribution tells Scott where to simplify or improve.

base_dispensary_crm_configuration
DBT_base

Every config flag per store. Scott can build a feature adoption matrix: which features are on, for how many stores, correlated with store revenue and state. This is product roadmap fuel.

invoice_line
15.2M rows · DBT_invoice

Full invoice receiving workflow. What % of invoice lines have all cost fields populated vs. partial? Low completion rates indicate UX friction in the receiving flow — a specific, actionable improvement target.

checkin_time
192M rows · DBT_visit

Check-in timestamp data. Scott can measure the time delta between check-in and ticket close — proxy for wait time and transaction efficiency. Long waits = lobby or POS workflow problems worth solving.

Map the dbt feature surface area to Treez product features and measure adoption per feature using recent row counts: Inventory Managementbatch_fact, batch_audit_log · Receiving/Invoicinginvoice_line, invoice_payments · POS/Salesticketline_sales, ticket_payments · Discountingticketline_discounts · Employee Mgmtemployee_hourly_sales_v2 · Cash Managementshift_tickets, cash_handling_audit_log. Models with low recent activity = features nobody's using. Cross-reference with Pendo page views on the corresponding UI surfaces to confirm — if Snowflake says low activity AND Pendo says low clicks, that feature needs rethinking.

📋 Summary Matrix

Person Role Top Model Why It Matters
HM CEO ra_hourly_sales Platform revenue pulse + active store health
Joey Product tenant_catalog + base_product_api_public_products Feature adoption rates + catalog data quality
Jeremy Head of Eng tenant_ingestion_table Debezium pipeline health + SLA alerting
Constantine Database store_product_monthly_inventory_changes Most complex reconciliation — where bugs hide
Beth Marketing ra_customer_demographic_sales Consumer personas + thought leadership content
Richard Director tenant_selltreez + store_daily_sales Client health scoring + churn risk signals
Josh & Nick Sales ra_inventory_performance.v2 + employee_hourly_sales_v2 Demo value + platform-scale proof points
Scott Product Manager batch_audit_log + ticketline_discounts Feature usage mapping + UX improvement targets
Generated by Ting 🌲 · Treez Employee Agent · February 2026 · Based on dbt manifest v12 + catalog from warehouse_transformations project