Smart Search, Documents and RAG

Certified Lead Developer

My client is concerned about reducing token usage, and we are investigating how to prevent repeated tokenization and embedding.

We understand that Document Management Record Types embed document representations for use in smart search. Are these embeddings used system-wide when document content is provided to an AI Agent or AI Skill?

Are they only available if the document is referenced using the special system document type?

Help us understand.

Thanks!

  •  

    Smart search tracks actual data changes on each data sync (which triggers indexing), and only indexes or re-indexes new or changed rows of data when creating or updating embeddings.
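
    To illustrate the general idea of skipping unchanged rows, here is a minimal sketch of hash-based change detection. This is not Appian's implementation; the function names (`row_fingerprint`, `rows_to_reindex`) and the in-memory `seen` store are assumptions made up for the example.

```python
import hashlib


def row_fingerprint(row: dict) -> str:
    """Stable hash of a row's content, used to detect real data changes."""
    # Sort keys so the same content always yields the same fingerprint.
    canonical = "\x1f".join(f"{key}={row[key]}" for key in sorted(row))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def rows_to_reindex(rows: dict, seen: dict) -> list:
    """Return ids of rows that are new or changed since the previous sync.

    rows: row id -> row content (hypothetical shape)
    seen: row id -> fingerprint from the previous sync (updated in place)
    """
    changed = []
    for row_id, row in rows.items():
        fingerprint = row_fingerprint(row)
        if seen.get(row_id) != fingerprint:
            changed.append(row_id)       # only these rows would be re-embedded
            seen[row_id] = fingerprint
    return changed
```

    On a second sync with identical data, `rows_to_reindex` returns an empty list, so no embedding work (and no embedding tokens) would be spent on rows that did not change.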

    Additionally, agentic use of smart search reuses the same embeddings for records search, and document search will do the same in the future. (There are some notable exceptions. For example, with Agent Studio and doc extraction: passing in a document as an input or a tool uses the Agent Index, while a doc extraction tool called by an agent uses the doc extraction index.)

    For AI Skills, there's no underlying connection to Smart Search or its embeddings. When you load a record document into AI Skills, tokens are consumed for the entire document, i.e. no embedding info is used. If you instead query using smart search and send chunks of the document to AI Skills as text, only those tokens are consumed. Either way, new tokens are used every time a document is used with AI Skills.
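
    As a rough way to reason about the difference, the sketch below compares token consumption for sending a whole document versus only the retrieved chunks, across repeated calls. The word-count "tokenizer" is a crude stand-in for a real one, and all function names are hypothetical; actual token counts depend on the model used.

```python
def approx_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer: counts whitespace-separated words."""
    return len(text.split())


def tokens_for_full_document(document: str, calls: int) -> int:
    """Tokens consumed when the whole document is sent on every call."""
    return approx_tokens(document) * calls


def tokens_for_chunks(chunks: list, calls: int) -> int:
    """Tokens consumed when only the retrieved chunks are sent on every call."""
    return sum(approx_tokens(chunk) for chunk in chunks) * calls


# Hypothetical example: a 1000-word document vs. two retrieved chunks.
document = "word " * 1000
chunks = ["word " * 50, "word " * 30]
full_cost = tokens_for_full_document(document, calls=3)
chunk_cost = tokens_for_chunks(chunks, calls=3)
```

    Both costs scale linearly with the number of calls, which matches the point above: nothing is cached between uses, so the saving comes only from sending less text each time.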

  •
    Certified Lead Developer
    in reply to Derek Knoderer

    Thanks Derek. What I understand you to be saying is that there are multiple siloed embedding locations:

    1. Documents chat (already knew about that)
    2. Smart Search (NOT available to anything but smart search?)
    3. The "Agent Index" (I have never heard about this one)
    4. Document Extraction (I think I know what you are referring to here)

    Would you mind clarifying number 3?

    • The "Agent Index": how does it work, what records and documents are accessible to it, and what features can use it?
    • Are you saying that the embedding content is *only* available through the use of an Agent?
    • Are you saying that the Agent Index and the smart search embeddings will be merged in the future, or are they already approximately synonymous?

    Appreciate the help!

    K/R