This page also offers some insight into how the semantic match and the LLM arrive at their recommendations.

Commodity Code Recommendation
Displays the best-fit 8-digit UNSPSC code and title, plus the brief explanation (commodity_answer) the LLM gives for its choice.

Semantic Search Matches
Lists the ten commodity codes with the highest cosine similarity to your prompt (the score is labeled “Distance” on the page).
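To make the ranking concrete, here is a minimal sketch of a top-k cosine-similarity search. It is a brute-force illustration only: the real service queries a vector index rather than scanning every vector, and the function names, toy vectors, and placeholder codes below are assumptions, not part of the application.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)

def top_k_matches(query_vec, catalog, k=10):
    """Return the k (code, similarity) pairs with the highest similarity."""
    scored = [(code, cosine_similarity(query_vec, vec)) for code, vec in catalog.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Toy 3-dimensional vectors; the codes are placeholders, not real UNSPSC entries.
toy_catalog = {
    "00000001": [1.0, 0.0, 0.0],
    "00000002": [0.7, 0.7, 0.0],
    "00000003": [0.0, 1.0, 0.0],
    "00000004": [0.0, 0.0, 1.0],
}
matches = top_k_matches([1.0, 0.1, 0.0], toy_catalog, k=3)
for code, score in matches:
    print(code, round(score, 3))
```

A higher score means a closer match, so the list is sorted in descending order even though the page labels the score “Distance”.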
Nearest-Neighbor Commodity Code Details
Expands those ten neighbors into their full UNSPSC hierarchy (Segment → Family → Class → Commodity), complete with definitions and notes.
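The hierarchy can be derived from the code itself, because each pair of digits in an 8-digit UNSPSC code identifies one level. The sketch below shows only that digit split; the function name is hypothetical, and the definitions and notes shown on the page would still need a lookup against the catalog.

```python
def unspsc_hierarchy(code):
    """Split an 8-digit UNSPSC commodity code into its four hierarchy levels."""
    if len(code) != 8 or not code.isdigit():
        raise ValueError(f"expected an 8-digit code, got {code!r}")
    return {
        "segment": code[:2] + "000000",  # first two digits
        "family": code[:4] + "0000",     # first four digits
        "class": code[:6] + "00",        # first six digits
        "commodity": code,               # all eight digits
    }

# The sample code is used only to show the digit split.
levels = unspsc_hierarchy("43211503")
for level, value in levels.items():
    print(f"{level:>9}: {value}")
```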
Nearest-Neighbor t-SNE Visualization
Plots the same ten neighbors in 2-D space (pre-computed t-SNE slice) so you can see how tightly they cluster.
Closest Object Code
Shows the recommended North Carolina object code and title, plus the ten most similar object codes with similarity scores.
Note: an LLM rationale for this object-code choice is not yet returned; add object_answer to surface it.
Attribute Highlights
The /attributes endpoint passes your prompt to Gemini to extract key structured attributes (e.g., material, unit of measure, industry tags). The JSON is rendered as an expandable list for quick copy/paste.
Sample Products & Pricing
The /products endpoint queries eBay and DummyJSON to pull up to five real-world SKUs, prices, and seller links that match the recommended commodity code, which is useful for sanity-checking market fit.
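As a rough sketch of how the five-item cap might combine with the one-hour result cache listed in the pipeline table, the example below wraps a stubbed provider lookup in a small time-to-live cache. Everything here is an assumption for illustration: `fetch_from_providers` is a hypothetical stand-in for the eBay and DummyJSON calls, and the real endpoint may cache differently.

```python
import time

CACHE_TTL_SECONDS = 3600  # one hour
MAX_ITEMS = 5             # return at most five products

_cache = {}  # commodity code -> (timestamp, items)

def fetch_from_providers(code):
    """Hypothetical stand-in for the eBay and DummyJSON lookups."""
    return [{"sku": f"{code}-{i}", "price": 10.0 + i} for i in range(8)]

def sample_products(code, now=time.monotonic):
    """Return up to MAX_ITEMS products, reusing results younger than the TTL."""
    cached = _cache.get(code)
    if cached and now() - cached[0] < CACHE_TTL_SECONDS:
        return cached[1]
    items = fetch_from_providers(code)[:MAX_ITEMS]
    _cache[code] = (now(), items)
    return items

items = sample_products("demo-code")
print(len(items), "items; second call served from cache:",
      sample_products("demo-code") is items)
```

The `now` parameter exists only so the clock can be swapped out when testing expiry.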
Full Market Analysis Report
Clicking “Generate Report” triggers the /report endpoint: Gemini writes a multi-section HTML analysis (overview, demand factors, supplier landscape, cost drivers, etc.). The file is uploaded to Cloud Storage and the page redirects you to the public link.

The data pipeline visualization below traces the movement from user prompt and raw data to on-screen report. Latency was trimmed by caching, parallel calls, and pre-computing heavy work, while costs were controlled with spot compute, right-sizing, static hosting, and token/egress limits.

  
| Stage | What happens | Latency-reduction tactics | Cost-management tactics |
| --- | --- | --- | --- |
| Data Sources | UNSPSC and NC object-code CSVs live in Cloud Storage. | Keep files in the same region as Cloud Run. | Lifecycle moves idle CSVs to Nearline after 30 days. Pre-processing runs only on catalog updates. |
| Offline Pre-processing | Batch Embedding Pipeline → Vertex AI embedding-001. Populate Vector Search Index. Pre-compute t-SNE coordinates. | Heavy math happens once, not per request. | Run on pre-emptible VMs (≈70 % cheaper). 1 000-row batches lower per-call overhead. Scheduler triggers jobs; pay only when running. |
| User Request (runtime) | Embed user prompt. Vector query → top 10 commodity codes. Gemini Pro returns recommendation + rationale. Second vector query → closest object codes. | 64 MB in-RAM LRU cache for embeddings. Parallel commodity / object searches. “latest-short” Gemini model, short prompt. | Cap Gemini output to 300 tokens. Batch API via Pub/Sub to ride the free tier. searchTimeoutMs limits paid stall time. |
| Attribute Extraction Branch (/attributes endpoint) | Gemini Pro extracts structured JSON attributes from the prompt. | Tiny prompt ⇒ sub-200 ms response. Caching identical prompts. | Use “latest-short” model to halve cost. Throttle to 20 req/min with Cloud Run concurrency. |
| Products Lookup Branch (/products endpoint) | Queries eBay & DummyJSON providers for sample SKUs + prices. | Async HTTP calls; results cached 1 h. Return top 5 items only. | Use free-tier APIs in dev. Provider throttle and 1-hr cache cut egress. |
| Market Report Generation (/report endpoint) | Gemini Pro writes multi-section HTML analysis. Uploads finished report to GCS; user redirected. | Stream Gemini output; user sees spinner. Runs in background worker to free API thread. | Generate only on explicit request. Store static HTML; no regen cost on repeat views. Lifecycle policy archives reports after 6 months. |
| Cloud Run API | rag_pipeline.py merges all branch outputs and gzip-compresses JSON. | Min instances = 1 keeps a warm container. | CPU 1 / RAM 2 GiB (right-sized). Autoscaling floor 0 nights/weekends. Image < 250 MB for faster builds. |
| Frontend (index.html + JS) | Renders tables, hierarchy, t-SNE plot, attribute and product panels. | Lazy-load heavy libs only when tabs open. | Static hosting on GCS website (¼ Cloud Run cost). Bundle & minify to cut egress. Cloudflare (free) CDN caching. |
| On-screen Reports & Visuals | User sees results in ≈ 2 s. | Reuse cached JSON for instant tab switching. | Client-side libs < 40 kB gzip reduce bandwidth fees. |
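Two of the runtime latency tactics, caching prompt embeddings and running the commodity and object searches in parallel, can be sketched with the standard library alone. This is an illustrative sketch under stated assumptions, not the code in rag_pipeline.py: the embedding and search functions are stubs with hypothetical names, and `functools.lru_cache` bounds the cache by entry count, whereas the pipeline describes a 64 MB in-RAM limit.

```python
import asyncio
from functools import lru_cache

EMBED_CALLS = 0  # counts how often the stubbed embedding call actually runs

@lru_cache(maxsize=4096)  # bounded by entry count, not by bytes
def embed_prompt(prompt: str) -> tuple:
    """Stub for the embedding call; a tuple keeps the cached value immutable."""
    global EMBED_CALLS
    EMBED_CALLS += 1
    return tuple(float(ord(ch) % 7) for ch in prompt[:8])

async def search_commodity_codes(vector):
    await asyncio.sleep(0.05)  # stands in for a vector-index round trip
    return ["commodity-match"]

async def search_object_codes(vector):
    await asyncio.sleep(0.05)  # stands in for the second vector query
    return ["object-match"]

async def handle_request(prompt):
    vector = embed_prompt(prompt)  # served from the cache on repeat prompts
    # Both searches wait concurrently, so the total wait is roughly the
    # slower of the two rather than their sum.
    commodity, objects = await asyncio.gather(
        search_commodity_codes(vector),
        search_object_codes(vector),
    )
    return {"commodity": commodity, "object": objects}

first = asyncio.run(handle_request("stainless steel bolts"))
second = asyncio.run(handle_request("stainless steel bolts"))
print(first, "embedding calls:", EMBED_CALLS)
```

Repeating the same prompt reuses the cached embedding, so only the two searches run on the second request.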

Commodity Codes: UNSPSC

Object Codes: NCCCS Chart of Accounts