Enter a product or service and the search returns a commodity code and object code.

This page also provides some insight into how the semantic match and LLM are making their recommendations, including the object code description and the LLM’s reasoning for selecting the commodity code.

  • Commodity Code Recommendation: Shows the best-fit 8-digit UNSPSC commodity code and title, along with a brief explanation from the LLM to support its choice.

  • Semantic Search Matches: Lists the ten commodity codes that rank highest by cosine similarity to the user’s prompt, plus each similarity score (labelled “Distance”).

  • Nearest-Neighbor Commodity Code Details: Expands each of those ten neighbors into their full UNSPSC hierarchy (Segment → Family → Class → Commodity), so users see contextual definitions and notes.

  • Nearest-Neighbor t-SNE Visualization: Plots the same ten neighbors in 2-D space using a pre-computed t-SNE array, giving a visual sense of how close they are to one another.

  • Closest Object Code: Displays the recommended North Carolina object code and title and a table of the ten most semantically similar object codes with their similarity scores.
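
The ranking and hierarchy expansion described in the list above can be sketched roughly as follows. This is a minimal illustration, not the application's actual code: the function names, the toy two-dimensional vectors, and the in-memory `catalog` dictionary are all assumptions made for the example. The only fact relied on is that an 8-digit UNSPSC code nests its Segment, Family, and Class in its first two, four, and six digits.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)


def top_k_matches(query_vec, catalog, k=10):
    """Return the k (code, score) pairs with the highest cosine similarity.

    `catalog` is a hypothetical dict mapping commodity code -> embedding.
    """
    scored = [(code, cosine_similarity(query_vec, vec))
              for code, vec in catalog.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def unspsc_hierarchy(code):
    """Split an 8-digit UNSPSC code into its four hierarchy levels."""
    if len(code) != 8 or not code.isdigit():
        raise ValueError("expected an 8-digit UNSPSC code")
    return {
        "segment": code[:2] + "000000",
        "family": code[:4] + "0000",
        "class": code[:6] + "00",
        "commodity": code,
    }
```

In a production setting the similarity search would be delegated to the vector index rather than computed in a Python loop; the loop is shown only to make the ranking rule explicit.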

The data pipeline visualization below traces the movement from user prompt and raw data to on-screen report. Latency was trimmed by caching, parallel calls, and pre-computing heavy work, while costs were controlled with spot compute, right-sizing, static hosting, and token/egress limits.
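
Two of the latency tactics named above, caching and parallel calls, can be sketched with the Python standard library. This is an illustrative sketch under stated assumptions: `embed_prompt`, `search_commodity_codes`, and `search_object_codes` are hypothetical stand-ins for the real embedding and vector-search calls, and `functools.lru_cache` bounds the cache by entry count rather than by memory size.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import lru_cache


@lru_cache(maxsize=1024)  # assumption: bounded by entries, not by bytes
def embed_prompt(prompt):
    """Hypothetical stand-in for the embedding call; returns a fake vector."""
    return tuple(float(ord(ch) % 7) for ch in prompt[:8])


def search_commodity_codes(vec):
    """Hypothetical stand-in for the commodity-code vector query."""
    return [("commodity-placeholder", sum(vec))]


def search_object_codes(vec):
    """Hypothetical stand-in for the object-code vector query."""
    return [("object-placeholder", sum(vec))]


def handle_request(prompt):
    """Embed once (cached), then run both vector searches concurrently."""
    vec = embed_prompt(prompt)
    with ThreadPoolExecutor(max_workers=2) as pool:
        commodity_future = pool.submit(search_commodity_codes, vec)
        object_future = pool.submit(search_object_codes, vec)
        return {
            "commodity_matches": commodity_future.result(),
            "object_matches": object_future.result(),
        }
```

Calling `handle_request` twice with the same prompt reuses the cached embedding on the second call, which `embed_prompt.cache_info()` reports as a hit. Threads suit this case because the real calls would be network-bound rather than CPU-bound.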

Stage: Data Sources
  What happens: UNSPSC and NC object-code CSVs are stored in Cloud Storage.
  Latency-reduction tactics: Keep files in a regional bucket near the Cloud Run service.
  Cost-management tactics:
    • Archive tier + lifecycle rules move untouched CSVs to Nearline after 30 days.
    • Retrieval is cheap because preprocessing runs only when a catalog update is published.

Stage: Offline Pre-processing
  What happens:
    • Batch Embedding Pipeline → Vertex AI embedding-001
    • Populate Vector Search Index
    • Pre-compute t-SNE coordinates
  Latency-reduction tactics: Heavy math done once, not per request.
  Cost-management tactics:
    • Run batch job on preemptible GCE VMs (spot pricing, ~70% cheaper).
    • Submit embeddings in 1,000-record batches to cut per-request overhead.
    • Schedule t-SNE job with Cloud Scheduler so you pay only when it runs.

Stage: User Request (runtime)
  What happens:
    • Embed the user prompt.
    • Vector query → top 10 commodity codes.
    • Gemini Pro call returns recommendation + rationale.
    • Second vector query → closest object codes.
  Latency-reduction tactics:
    • 64 MB in-RAM LRU cache avoids repeat embeddings.
    • Parallel commodity/object searches.
    • Short prompt using “latest-short” Gemini model.
  Cost-management tactics:
    • Cap Gemini calls to 300 output tokens.
    • Batch API billing via Pub/Sub buffer to stay in free tier during spikes.
    • Set searchTimeoutMs to skip paying for slow queries.

Stage: Cloud Run API
  What happens: rag_pipeline.py merges everything and returns gzip-compressed JSON.
  Latency-reduction tactics: Min instances = 1 keeps a warm container.
  Cost-management tactics:
    • CPU = 1 / Memory = 2 GiB (right-sized from load tests).
    • Autoscaling floor = 0 at night/weekends in dev.
    • Image < 250 MB to cut build/pull minutes.

Stage: Frontend (index.html + JS)
  What happens: Renders tables, expandable hierarchy, and the t-SNE plot.
  Latency-reduction tactics: Lazy-load chart libraries only when the viz tab opens.
  Cost-management tactics:
    • Host static assets on GCS static website (≈¼ the cost of Cloud Run).
    • Bundle & minify JS/CSS to reduce egress.
    • Use Cloudflare (free tier) for CDN caching.

Stage: On-screen Reports & Visuals
  What happens: User sees results in ≈ 2 s.
  Latency-reduction tactics: Reuse cached JSON for instant tab switching.
  Cost-management tactics:
    • Keep client-side libraries < 40 kB gzip to lower bandwidth fees, especially on mobile.
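
Two of the mechanical steps in the pipeline, submitting records in fixed-size batches and returning gzip-compressed JSON, are simple enough to sketch with the standard library. The helper names `chunk_records` and `gzip_json` are hypothetical and are not taken from rag_pipeline.py; the batch size of 1,000 follows the figure quoted in the pre-processing stage.

```python
import gzip
import json


def chunk_records(records, batch_size=1000):
    """Yield consecutive slices of `records`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]


def gzip_json(payload):
    """Serialize `payload` to compact JSON and gzip-compress the bytes."""
    raw = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    return gzip.compress(raw)
```

A response built with `gzip_json` would also need a `Content-Encoding: gzip` header so that the browser decompresses it transparently.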

Commodity Codes: UNSPSC

Object Codes: NCCCS Chart of Accounts