This page also provides some insight into how the semantic match and the LLM arrive at their recommendations.
Commodity Code Recommendation
Displays the best-fit 8-digit UNSPSC code and title, plus the brief explanation (commodity_answer) the LLM gives for its choice.
Semantic Search Matches
Lists the ten commodity codes with the highest cosine similarity (“Distance”) to your prompt.
Nearest-Neighbor Commodity Code Details
Expands those ten neighbors into their full UNSPSC hierarchy (Segment → Family → Class → Commodity), complete with definitions and notes.
Nearest-Neighbor t-SNE Visualization
Plots the same ten neighbors in 2-D space (pre-computed t-SNE slice) so you can see how tightly they cluster.
Closest Object Code
Shows the recommended North Carolina object code and title, plus the ten most similar object codes with similarity scores.
Note: an LLM rationale for this object-code choice is not yet returned; add object_answer to surface it.
Attribute Highlights
The /attributes endpoint passes your prompt to Gemini to extract key structured attributes (e.g., material, unit of measure, industry tags). The JSON is rendered as an expandable list for quick copy/paste.
Sample Products & Pricing
The /products endpoint queries eBay and DummyJSON to pull up to five real-world SKUs, prices, and seller links that match the recommended commodity code, useful for sanity-checking market fit.
Full Market Analysis Report
Clicking “Generate Report” triggers the /report endpoint: Gemini writes a multi-section HTML analysis (overview, demand factors, supplier landscape, cost drivers, etc.). The file is uploaded to Cloud Storage and the page redirects you to the public link.
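The Semantic Search Matches and Closest Object Code panels above both rank candidates by cosine similarity to the prompt. As a rough illustration of that ranking step, here is a minimal standard-library sketch. The function names, the placeholder codes, and the three-dimensional toy vectors are assumptions made for this example; the actual pipeline's embedding model and storage are not described on this page.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def top_k_matches(
    query: Sequence[float],
    catalog: Dict[str, Sequence[float]],
    k: int = 10,
) -> List[Tuple[str, float]]:
    """Rank catalog entries by similarity to the query and keep the best k."""
    scored = [(code, cosine_similarity(query, vec)) for code, vec in catalog.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy catalog: placeholder codes mapped to made-up embeddings.
catalog = {
    "code-A": [1.0, 0.0, 0.0],
    "code-B": [0.9, 0.1, 0.0],
    "code-C": [0.0, 1.0, 0.0],
}
matches = top_k_matches([1.0, 0.0, 0.0], catalog, k=2)
for code, score in matches:
    print(f"{code}: {score:.4f}")
```

With real embeddings the catalog would hold one vector per commodity or object code, and k would be 10 to match the panels.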
The data pipeline visualization below traces the movement from user prompt and raw data to on-screen report. Latency was trimmed by caching, parallel calls, and pre-computing heavy work, while costs were controlled with spot compute, right-sizing, static hosting, and token/egress limits.
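To make two of the latency tactics named above concrete (parallel calls and caching), the sketch below runs two branch lookups concurrently with asyncio.gather and keeps the merged result in an in-memory cache. The stub coroutines, the cache dictionary, and the sample prompt are hypothetical stand-ins; the page does not show how the real service issues its branch calls or where it caches.

```python
import asyncio
from typing import Dict

# Hypothetical in-memory cache keyed by prompt text.
_cache: Dict[str, dict] = {}
call_count = {"attributes": 0, "products": 0}


async def fetch_attributes(prompt: str) -> dict:
    """Stand-in for the /attributes branch; the sleep simulates network time."""
    call_count["attributes"] += 1
    await asyncio.sleep(0.05)
    return {"branch": "attributes", "prompt": prompt}


async def fetch_products(prompt: str) -> dict:
    """Stand-in for the /products branch; the sleep simulates network time."""
    call_count["products"] += 1
    await asyncio.sleep(0.05)
    return {"branch": "products", "prompt": prompt}


async def handle_prompt(prompt: str) -> dict:
    """Serve from cache when possible, otherwise run both branches at once."""
    if prompt in _cache:
        return _cache[prompt]
    # gather() schedules both coroutines concurrently, so the total wait is
    # roughly the slower branch rather than the sum of the two.
    attributes, products = await asyncio.gather(
        fetch_attributes(prompt), fetch_products(prompt)
    )
    result = {"attributes": attributes, "products": products}
    _cache[prompt] = result
    return result


first = asyncio.run(handle_prompt("stainless steel bolts"))
second = asyncio.run(handle_prompt("stainless steel bolts"))  # cache hit
print(call_count)
```

The second call returns the cached object without touching either branch, which is the same idea the frontend uses when it reuses cached JSON for tab switching.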
Stage | What happens | Latency-reduction tactics | Cost-management tactics |
---|---|---|---|
Data Sources | UNSPSC and NC object-code CSVs live in Cloud Storage. | Keep files in the same region as Cloud Run. | |
Offline Pre-processing | | Heavy math happens once, not per request. | |
User Request (runtime) | | | |
Attribute Extraction Branch (/attributes endpoint) | | | |
Products Lookup Branch (/products endpoint) | | | |
Market Report Generation (/report endpoint) | | | |
Cloud Run API | rag_pipeline.py merges all branch outputs and gzip-compresses JSON. | Min instances = 1 keeps a warm container. | |
Frontend (index.html + JS) | Renders tables, hierarchy, t-SNE plot, attribute and product panels. | Lazy-load heavy libs only when tabs open. | |
On-screen Reports & Visuals | User sees results in ≈ 2 s. | Reuse cached JSON for instant tab switching. | |
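The Cloud Run API row describes merging the branch outputs and gzip-compressing the JSON. A minimal sketch of that step using only the standard library might look like the following. The merge_and_compress helper and the sample payload are assumptions for illustration and are not taken from rag_pipeline.py.

```python
import gzip
import json


def merge_and_compress(branches: dict) -> bytes:
    """Serialize the merged branch outputs compactly, then gzip the bytes."""
    raw = json.dumps(branches, separators=(",", ":")).encode("utf-8")
    return gzip.compress(raw)


# Hypothetical merged payload; the repeated rows mimic a list of product hits.
branches = {
    "commodity": {"code": "placeholder", "title": "placeholder"},
    "attributes": {"material": "steel", "unit": "each"},
    "products": [{"title": "sample item", "price": 1.0} for _ in range(50)],
}

body = merge_and_compress(branches)
raw_size = len(json.dumps(branches, separators=(",", ":")).encode("utf-8"))
print(f"raw bytes: {raw_size}, gzip bytes: {len(body)}")

# A client (or browser, given a Content-Encoding: gzip header) reverses it.
restored = json.loads(gzip.decompress(body))
```

Repetitive JSON such as product lists tends to compress well, which is why this step helps with both transfer time and egress.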
Commodity Codes: UNSPSC
Object Codes: NCCCS Chart of Accounts