sherpa_ai package


Overview#

The sherpa_ai package is a framework for building multi-agent AI systems that solve complex problems through planning, reasoning, and collaboration. It provides agents, actions, planners, and memory systems that can be combined and customized for different applications.

Key Components

  • Agents: Domain-specific AI agents with different expertise and capabilities

  • Actions: Specialized operations that agents can perform to accomplish tasks

  • Policies: Decision-making strategies for agent behavior and reasoning

  • Memory Systems: Persistent storage for knowledge, beliefs, and conversation history

  • Models: Interfaces with various LLM providers with enhanced logging and error handling

  • Prompts: Template-based system for creating and managing prompts for language models

  • Tools: Utility functions and interfaces for common AI operations

  • Reflection: Capabilities for self-evaluation and improvement

  • Events: Event-driven architecture for coordinating system components

  • Test Utilities: Testing tools for data, language models, and logging

Installation#

The Sherpa AI package can be installed via pip:

pip install sherpa-ai

Subpackages#

Package – Description

  • sherpa_ai.actions – Collection of specialized actions that agents can perform to accomplish tasks.

  • sherpa_ai.agents – Specialized AI agents with different roles and expertise for various domains.

  • sherpa_ai.config – Configuration management tools for customizing system behavior.

  • sherpa_ai.connectors – Interfaces for connecting to external systems and databases.

  • sherpa_ai.database – Database interaction capabilities for persistence and analytics.

  • sherpa_ai.error_handling – Error management tools for recovery and stability.

  • sherpa_ai.memory – Knowledge persistence and sharing across sessions and components.

  • sherpa_ai.runtime – Runtime environment for executing agents asynchronously.

  • sherpa_ai.models – Interfaces with language model providers and enhanced model functionality.

  • sherpa_ai.output_parsers – Tools for validating and transforming model outputs.

  • sherpa_ai.policies – Decision-making strategies for agents to handle different scenarios.

  • sherpa_ai.prompts – Template-based system for creating, loading, and formatting prompts.

  • sherpa_ai.scrape – Utilities for extracting information from files and repositories.

  • sherpa_ai.test_utils – Testing utilities for data, language models, and logging.

  • sherpa_ai.verbose_loggers – Advanced logging capabilities for debugging and monitoring.

Submodules#

Module – Description

  • sherpa_ai.events – Event system for coordinating component interactions and message passing.

  • sherpa_ai.output_parsers – Tools for parsing and processing model outputs into usable formats.

  • sherpa_ai.post_processors – Processes model outputs for additional formatting and enhancement.

  • sherpa_ai.prompt – Tools for creating and managing prompts for language models.

  • sherpa_ai.prompt_generator – Dynamic prompt generation based on context and requirements.

  • sherpa_ai.reflection – Self-evaluation and improvement capabilities for agents.

  • sherpa_ai.tools – Utility functions and tools for common AI operations.

  • sherpa_ai.utils – General utility functions used throughout the framework.

sherpa_ai.events module#

class sherpa_ai.events.Event(**data)[source]#

Bases: BaseModel, ABC

Base class for all events in the system.

This abstract base class defines the common interface for all event types in the system. It inherits from both Pydantic’s BaseModel for data validation and ABC for abstract base class functionality.

name#

Name of the event, used for identification and logging.

Type:

str

Example

>>> class CustomEvent(Event):
...     pass
>>> event = CustomEvent(name="custom_event")
>>> print(event.name)
custom_event
name: str#

Name of the event.

sender: str#
event_type: str#
model_config: ClassVar[ConfigDict] = {}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class sherpa_ai.events.GenericEvent(**data)[source]#

Bases: Event

A flexible event type that can handle various event scenarios.

This class provides a generic event implementation that can store any type of content and event type. It’s particularly useful for handling user inputs and task registrations.

This class inherits from Event and provides methods to:
  • Store and retrieve event content

  • Handle different event types

  • Provide string representation of events

name#

The name of the event.

Type:

str

content#

The content or payload associated with the event.

Type:

Any

event_type#

The type of the event, defaults to “generic”.

Type:

str

Example

>>> from sherpa_ai.events import GenericEvent
>>> event = GenericEvent(name="user_message", content="Hello, world!", event_type="user_input")
>>> print(event.name)
user_message
>>> print(event.content)
Hello, world!
>>> print(event.event_type)
user_input

Notes

  • The event_type attribute can be customized; it defaults to “generic”.

name: str#

The name of the event.

content: Any#

The content or payload associated with the event. This can be any data relevant to the event.

event_type: str#

The type of the event. Defaults to “generic”.

model_config: ClassVar[ConfigDict] = {}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class sherpa_ai.events.TriggerEvent(**data)[source]#

Bases: Event

Event to trigger a state transition.

This class represents an event that triggers a state transition in the system. It captures the event name and its associated arguments. This is useful for handling user inputs and message communications through the shared memory.

name#

The name of the event being triggered.

Type:

str

args#

A dictionary containing the arguments passed to the action.

Type:

dict[str, Any]

event_type#

The type of the event, fixed to “trigger”.

Type:

str

Example

>>> from sherpa_ai.events import TriggerEvent
>>> event = TriggerEvent(name="user_input", args={"message": "Hello, world!"})
>>> print(event.name)
user_input
name: str#

The name of the event being triggered.

args: dict[str, Any]#

A dictionary containing the arguments passed to the action.

event_type: str#

The type of the event, which is always set to “trigger”.

model_config: ClassVar[ConfigDict] = {}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class sherpa_ai.events.ActionStartEvent(**data)[source]#

Bases: Event

Event triggered when an action begins execution.

This class represents the start of an action execution, capturing the action name and its arguments. It’s used for tracking and logging action execution flow.

This class inherits from Event and provides methods to:
  • Track action execution start

  • Store action arguments

  • Provide immutable event type

name#

The name of the action being executed.

Type:

str

args#

Dictionary containing the arguments passed to the action.

Type:

dict[str, Any]

event_type#

The type of the event, fixed to “action_start”.

Type:

str

Example

>>> from sherpa_ai.events import ActionStartEvent
>>> event = ActionStartEvent(name="fetch_data", args={"url": "https://example.com"})
>>> print(event.name)
fetch_data
>>> print(event.args)
{'url': 'https://example.com'}
>>> print(event.event_type)
action_start

Notes

  • The event_type attribute is fixed to “action_start” and cannot be changed.

name: str#

The name of the action being executed.

args: dict[str, Any]#

A dictionary containing the arguments passed to the action.

event_type: str#

The type of the event, which is always set to “action_start”.

model_config: ClassVar[ConfigDict] = {}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class sherpa_ai.events.ActionFinishEvent(**data)[source]#

Bases: Event

Event triggered when an action completes execution.

This class represents the completion of an action execution, capturing the action name and its outputs. It’s used for tracking and logging action execution results.

This class inherits from Event and provides methods to:
  • Track action execution completion

  • Store action outputs

  • Provide immutable event type

name#

The name of the action that was executed.

Type:

str

outputs#

The outputs or results produced by the action.

Type:

Any

event_type#

The type of the event, fixed to “action_finish”.

Type:

str

Example

>>> from sherpa_ai.events import ActionFinishEvent
>>> event = ActionFinishEvent(name="fetch_data", outputs={"status": "success", "data": [1, 2, 3]})
>>> print(event.name)
fetch_data
>>> print(event.outputs)
{'status': 'success', 'data': [1, 2, 3]}
>>> print(event.event_type)
action_finish

Notes

  • The event_type attribute is fixed to “action_finish” and should not be changed.

name: str#

The name of the action that was executed.

outputs: Any#

The outputs or results produced by the action.

event_type: str#

The type of the event, which is always set to “action_finish”.

model_config: ClassVar[ConfigDict] = {}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

sherpa_ai.events.build_event(event_type, name, **kwargs)[source]#

Factory function to create appropriate Event objects based on event type.

This function creates and returns the appropriate Event subclass instance based on the specified event_type. It handles specialized events like ActionStartEvent and ActionFinishEvent, falling back to GenericEvent for other event types.

Parameters:
  • event_type (str) – The type of event to create. Special handling for “action_start” and “action_finish”. All other values create a GenericEvent.

  • name (str) – The name of the event.

  • **kwargs – Additional keyword arguments to pass to the event constructor. For “action_start”: Should include ‘args’ dictionary. For “action_finish”: Should include ‘outputs’ with results. For other event types: Should include ‘content’ for the event payload.

Returns:

An instance of the appropriate Event subclass:
  • ActionStartEvent for “action_start” event_type

  • ActionFinishEvent for “action_finish” event_type

  • GenericEvent for all other event_type values

Return type:

Event

Example

>>> from sherpa_ai.events import build_event
>>> start_event = build_event("action_start", "fetch_data", args={"url": "example.com"})
>>> finish_event = build_event("action_finish", "fetch_data", outputs={"status": "success"})
>>> generic_event = build_event("user_input", "user_query", content="How are you?")
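
The dispatch rule described above (two special event types, with everything else falling back to a generic event) can be sketched with the standard library alone. This is a minimal illustration, not the library's implementation: the `Sketch*` classes and `build_event_sketch` are hypothetical stand-ins built on dataclasses rather than Pydantic models.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class SketchEvent:
    """Hypothetical stand-in for the Event base class."""
    name: str
    event_type: str = "generic"


@dataclass
class SketchActionStart(SketchEvent):
    args: dict[str, Any] = field(default_factory=dict)
    event_type: str = "action_start"


@dataclass
class SketchActionFinish(SketchEvent):
    outputs: Any = None
    event_type: str = "action_finish"


@dataclass
class SketchGeneric(SketchEvent):
    content: Any = None


def build_event_sketch(event_type: str, name: str, **kwargs: Any) -> SketchEvent:
    """Return the specialized event for known types, else a generic event."""
    if event_type == "action_start":
        return SketchActionStart(name=name, args=dict(kwargs.get("args", {})))
    if event_type == "action_finish":
        return SketchActionFinish(name=name, outputs=kwargs.get("outputs"))
    # Any other event_type is preserved on a generic event.
    return SketchGeneric(name=name, event_type=event_type, content=kwargs.get("content"))
```

Keeping the fallback branch last means new event types work without changing the factory; they simply arrive as generic events that carry their own event_type.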

sherpa_ai.output_parsers module#

Output parsing and validation module for Sherpa AI.

This module provides various parsers and validators for processing model outputs. It includes parsers for links, Markdown to Slack conversion, and validators for citations, numbers, and entities.

Example

>>> from sherpa_ai.output_parsers import LinkParser, NumberValidation
>>> link_parser = LinkParser()
>>> links = link_parser.parse("Check out https://example.com")
>>> number_validator = NumberValidation()
>>> result = number_validator.validate("The answer is 42")
class sherpa_ai.output_parsers.LinkParser[source]#

Bases: BaseOutputParser

Parser for converting between links and symbolic references.

This class handles the conversion of URLs to symbolic references and vice versa, maintaining a consistent mapping between them. It can process both raw URLs and tool-generated output containing links.

Attributes:

  • links (list) – List of unique links encountered during parsing.

  • link_to_id (dict) – Mapping of links to their symbolic references.

  • count (int) – Counter for generating unique symbol IDs.

  • output_counter (int) – Counter for reindexing output symbols.

  • reindex_mapping (dict) – Mapping of original IDs to reindexed IDs.

  • url_pattern (str) – Regex pattern for identifying links.

  • doc_id_pattern (str) – Regex pattern for identifying document IDs.

  • link_symbol (str) – Format string for link symbols.

Example:
>>> parser = LinkParser()
>>> text = "Check Link:example.com and Link:test.com"
>>> result = parser.parse_output(text, tool_output=True)
>>> print(result)
'DocID:[1]

DocID:[2] '

>>> back = parser.parse_output("[1] and [2]")
>>> print(back)
'<http://example.com|[1]> and <http://test.com|[2]>'
parse_output(text, tool_output=False)[source]#

Parse and transform links in text.

This method either converts URLs to symbolic references (when tool_output is True) or converts symbolic references back to clickable links (when tool_output is False).

Parameters:
  • text (str) – Text containing either URLs or symbolic references.

  • tool_output (bool) – Whether the input is from a tool (True) or user-facing text (False).

Returns:

Text with either URLs converted to symbols or symbols converted to clickable links.

Example:
>>> parser = LinkParser()
>>> # Convert URLs to symbols
>>> result = parser.parse_output("Link:example.com", tool_output=True)
>>> print(result)
'DocID:[1]
>>> # Convert symbols back to links
>>> result = parser.parse_output("[1]")
>>> print(result)
'<http://example.com|[1]>'
Return type:

str
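
The two directions of parse_output (links to symbols, then symbols back to clickable links) can be illustrated with a small standard-library sketch. It is not the LinkParser implementation: the `LinkSymbolMapper` class, its regex patterns, and the plain `[n]` symbol format are assumptions chosen for illustration, and the `Link:`/`DocID:` prefixes shown in the examples above are not modeled.

```python
import re

# Assumed patterns for this sketch only.
URL_PATTERN = re.compile(r"https?://[^\s<>|\]]+")
SYMBOL_PATTERN = re.compile(r"\[(\d+)\]")


class LinkSymbolMapper:
    """Hypothetical helper that maps URLs to numbered symbols and back."""

    def __init__(self) -> None:
        self.link_to_id: dict[str, int] = {}

    def to_symbols(self, text: str) -> str:
        """Replace each URL with a stable symbol such as [1]."""
        def replace(match: re.Match) -> str:
            url = match.group(0)
            # Reuse the existing ID for a repeated URL, otherwise assign the next one.
            link_id = self.link_to_id.setdefault(url, len(self.link_to_id) + 1)
            return f"[{link_id}]"

        return URL_PATTERN.sub(replace, text)

    def to_links(self, text: str) -> str:
        """Replace known symbols with Slack-style links; leave unknown ones alone."""
        id_to_link = {link_id: url for url, link_id in self.link_to_id.items()}

        def replace(match: re.Match) -> str:
            link_id = int(match.group(1))
            url = id_to_link.get(link_id)
            return f"<{url}|[{link_id}]>" if url else match.group(0)

        return SYMBOL_PATTERN.sub(replace, text)
```

Because the mapping lives on the instance, the same object has to handle both directions, which matches the stateful attributes listed for LinkParser.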

class sherpa_ai.output_parsers.MDToSlackParse[source]#

Bases: BaseOutputParser

Parser for converting Markdown links to Slack format.

This class converts Markdown-style links ([text](url)) to Slack’s link format (<url|text>). It maintains the link text and URL while changing only the syntax to match Slack’s requirements.

pattern#

Regex pattern for identifying Markdown links.

Type:

str

Example

>>> parser = MDToSlackParse()
>>> text = "Check out [this link](http://example.com)!"
>>> result = parser.parse_output(text)
>>> print(result)
'Check out <http://example.com|this link>!'
parse_output(text)[source]#

Convert Markdown links to Slack format.

This method finds all Markdown-style links in the input text and converts them to Slack’s link format while preserving the link text and URL.

Parameters:

text (str) – Text containing Markdown-style links.

Returns:

Text with links converted to Slack format.

Return type:

str

Example

>>> parser = MDToSlackParse()
>>> text = "See [docs](https://docs.com) and [code](https://code.com)"
>>> result = parser.parse_output(text)
>>> print(result)
'See <https://docs.com|docs> and <https://code.com|code>'
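
The conversion itself is a single regular-expression substitution. The sketch below shows one way to write it; the pattern and the function name `convert_md_links` are assumptions for illustration and may differ from the pattern attribute the class actually uses.

```python
import re

# [text](url) -> <url|text>; the URL is assumed to contain no spaces or ")".
MD_LINK = re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)")


def convert_md_links(text: str) -> str:
    """Rewrite Markdown links in Slack's <url|text> syntax."""
    return MD_LINK.sub(r"<\2|\1>", text)
```

Text that contains no Markdown links passes through unchanged, so the function is safe to apply to every outgoing message.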
class sherpa_ai.output_parsers.CitationValidation(sequence_threshold=0.7, jaccard_threshold=0.7, token_overlap=0.7)[source]#

Bases: BaseOutputProcessor

Validator and citation adder for text content.

This class analyzes text against source materials to validate content and add appropriate citations. It uses multiple similarity metrics to determine when citations are needed and which sources to cite.

sequence_threshold#

Minimum ratio of common subsequence length to text length for citation. Default is 0.7.

Type:

float

jaccard_threshold#

Minimum Jaccard similarity for citation. Default is 0.7.

Type:

float

token_overlap#

Minimum token overlap ratio for citation. Default is 0.7.

Type:

float

Example

>>> validator = CitationValidation(sequence_threshold=0.8)
>>> belief = Belief()  # Contains source about "Python is great"
>>> result = validator.process_output("Python is great!", belief)
>>> print("[1]" in result.result)  # Has citation
True
calculate_token_overlap(sentence1, sentence2)[source]#

Calculates the percentage of token overlap between two sentences.

This method tokenizes both sentences and calculates the percentage of shared tokens relative to each sentence’s length.

Parameters:
  • sentence1 (str) – First sentence to compare.

  • sentence2 (str) – Second sentence to compare.

Returns:

(overlap_ratio_1, overlap_ratio_2) where each ratio is the proportion of shared tokens to total tokens in that sentence.

Return type:

tuple

Example

>>> validator = CitationValidation()
>>> ratio1, ratio2 = validator.calculate_token_overlap(
...     "The cat is black",
...     "The cat is white"
... )
>>> print(f"{ratio1:.2f}, {ratio2:.2f}")
'0.75, 0.75'
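
As a rough illustration of the metric, the sketch below computes both ratios from sets of lowercase, whitespace-separated tokens. The tokenization is an assumption (the library may tokenize differently), and `token_overlap_sketch` is a hypothetical name.

```python
def token_overlap_sketch(sentence1: str, sentence2: str) -> tuple[float, float]:
    """Return the shared-token ratio relative to each sentence."""
    tokens1 = set(sentence1.lower().split())
    tokens2 = set(sentence2.lower().split())
    if not tokens1 or not tokens2:
        return 0.0, 0.0
    shared = tokens1 & tokens2
    return len(shared) / len(tokens1), len(shared) / len(tokens2)
```

For "The cat is black" and "The cat is white", three of the four tokens are shared on each side, which gives 0.75 for both ratios.
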
jaccard_index(sentence1, sentence2)[source]#

Calculates the Jaccard index between two sentences.

This method computes the Jaccard index (intersection over union) between the sets of tokens from both sentences.

Parameters:
  • sentence1 (str) – First sentence to compare.

  • sentence2 (str) – Second sentence to compare.

Returns:

Jaccard similarity score between 0 and 1.

Return type:

float

Example

>>> validator = CitationValidation()
>>> score = validator.jaccard_index(
...     "The cat is black",
...     "The cat is white"
... )
>>> print(f"{score:.2f}")
'0.60'
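
The Jaccard index is the size of the intersection of the two token sets divided by the size of their union. The following sketch assumes lowercase, whitespace-separated tokens; `jaccard_sketch` is a hypothetical name, not the library method.

```python
def jaccard_sketch(sentence1: str, sentence2: str) -> float:
    """Intersection over union of the two sentences' token sets."""
    tokens1 = set(sentence1.lower().split())
    tokens2 = set(sentence2.lower().split())
    union = tokens1 | tokens2
    if not union:
        return 0.0
    return len(tokens1 & tokens2) / len(union)
```

With the two example sentences, the intersection has three tokens and the union has five, so the score is 3/5 = 0.60.
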
longest_common_subsequence(text1, text2)[source]#

Calculate length of longest common subsequence.

This method finds the length of the longest subsequence of characters that appear in both texts in the same order.

Parameters:
  • text1 (str) – First text to compare.

  • text2 (str) – Second text to compare.

Returns:

Length of longest common subsequence.

Return type:

int

Example

>>> validator = CitationValidation()
>>> length = validator.longest_common_subsequence(
...     "hello world",
...     "hello there"
... )
>>> print(length)
7
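
A character-level longest common subsequence is usually computed with dynamic programming. The sketch below is a generic textbook version that keeps only two rows of the table; it is offered as an illustration of the technique and is not taken from the library source. The name `lcs_length` is hypothetical.

```python
def lcs_length(text1: str, text2: str) -> int:
    """Length of the longest common subsequence of two strings."""
    # previous[j] holds the LCS length of the processed prefix of text1 and text2[:j].
    previous = [0] * (len(text2) + 1)
    for ch1 in text1:
        current = [0]
        for j, ch2 in enumerate(text2, start=1):
            if ch1 == ch2:
                current.append(previous[j - 1] + 1)
            else:
                current.append(max(previous[j], current[j - 1]))
        previous = current
    return previous[-1]
```

Dividing the result by the length of the text being checked gives a ratio that can be compared against a threshold such as sequence_threshold.
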
flatten_nested_list(nested_list)[source]#

Flatten a nested list of strings.

Parameters:

nested_list (list[list[str]]) – List of lists of strings.

Returns:

Single list containing all non-empty strings.

Return type:

list[str]

Example

>>> validator = CitationValidation()
>>> flat = validator.flatten_nested_list([["a", "b"], ["c", ""]])
>>> print(flat)
['a', 'b', 'c']
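
Flattening while dropping empty strings fits in one comprehension. This is a sketch of the documented behavior under the hypothetical name `flatten_sketch`, not the library code.

```python
def flatten_sketch(nested_list: list[list[str]]) -> list[str]:
    """Flatten one level of nesting and skip empty strings."""
    return [item for sublist in nested_list for item in sublist if item]
```
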
split_paragraph_into_sentences(paragraph)[source]#

Split paragraph into sentences using NLTK.

Parameters:

paragraph (str) – Text to split into sentences.

Returns:

List of sentences from the paragraph.

Return type:

list[str]

Example

>>> validator = CitationValidation()
>>> sentences = validator.split_paragraph_into_sentences(
...     "Hello there. How are you?"
... )
>>> print(sentences)
['Hello there.', 'How are you?']
resources_from_belief(belief)[source]#

Extract resources from belief state actions.

Parameters:

belief (Belief) – Agent’s belief state containing actions.

Returns:

List of resources from retrieval actions.

Return type:

list[ActionResource]

Example

>>> validator = CitationValidation()
>>> belief = Belief()  # Contains retrieval action with resource
>>> resources = validator.resources_from_belief(belief)
>>> print(len(resources))
1
process_output(text, belief, **kwargs)[source]#

Process text and add citations from belief resources.

This method analyzes the input text against resources in the belief state and adds citations where appropriate based on similarity metrics.

Parameters:
  • text (str) – Text to process and add citations to.

  • belief (Belief) – Agent’s belief state containing resources.

  • **kwargs – Additional arguments for processing.

Returns:

Result containing text with citations added.

Return type:

ValidationResult

Example

>>> validator = CitationValidation()
>>> belief = Belief()  # Contains source about "Python"
>>> result = validator.process_output(
...     "Python is a great language.",
...     belief
... )
>>> print("[1]" in result.result)  # Has citation
True
add_citation_to_sentence(sentence, resources)[source]#

Add citations to a single sentence.

This method checks the sentence against each resource using similarity metrics to determine which sources to cite.

Parameters:
  • sentence (str) – Sentence to add citations to.

  • resources (list[ActionResource]) – Available citation sources.

Returns:

(citation_ids, citation_links) where citation_ids is a list of citation identifiers and citation_links is a list of citation links (URLs).

Return type:

tuple

Example

>>> validator = CitationValidation()
>>> resource = ActionResource(
...     source="http://example.com",
...     content="Python is great"
... )
>>> ids, urls = validator.add_citation_to_sentence(
...     "Python is great!",
...     [resource]
... )
>>> print(len(ids), urls[0])
1 http://example.com
format_sentence_with_citations(sentence, ids, links)[source]#

Format a sentence with its citations.

This method adds citation references to the end of a sentence in the format [id](url).

Parameters:
  • sentence (str) – Sentence to add citations to.

  • ids (list[int]) – Citation ID numbers.

  • links (list[str]) – Citation URLs.

Returns:

Sentence with citations added.

Return type:

str

Example

>>> validator = CitationValidation()
>>> result = validator.format_sentence_with_citations(
...     "Python is great.",
...     [1],
...     ["http://example.com"]
... )
>>> print(result)
'Python is great [1](http://example.com).'
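
The example output above places the citation before the sentence's final period. A minimal sketch of that formatting follows; the handling of several citations (joined with a comma) and of sentences without end punctuation is an assumption, and `format_with_citations` is a hypothetical name.

```python
def format_with_citations(sentence: str, ids: list[int], links: list[str]) -> str:
    """Append [id](url) citations, keeping any end punctuation last."""
    if not ids:
        return sentence
    citations = ", ".join(f"[{cid}]({link})" for cid, link in zip(ids, links))
    stripped = sentence.rstrip()
    if stripped and stripped[-1] in ".!?":
        # Insert the citations just before the closing punctuation mark.
        return f"{stripped[:-1]} {citations}{stripped[-1]}"
    return f"{stripped} {citations}"
```
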
add_citations(text, resources)[source]#
Return type:

ValidationResult

get_failure_message()[source]#
Return type:

str

class sherpa_ai.output_parsers.NumberValidation[source]#

Bases: BaseOutputProcessor

Validates the presence or absence of numerical information in a given piece of text.

This class validates that any numbers mentioned in generated text can be found in the source material, helping ensure numerical accuracy and prevent hallucination of numbers.

Example

>>> validator = NumberValidation()
>>> belief = Belief()  # Contains source text with "42 items"
>>> result = validator.process_output("There are 42 items.", belief)
>>> print(result.is_valid)
True
>>> result = validator.process_output("There are 100 items.", belief)
>>> print(result.is_valid)
False
process_output(text, belief, **kwargs)[source]#

Verifies that all numbers within text exist in the belief source text.

Parameters:
  • text (str) – Text containing numbers to validate.

  • belief (Belief) – Agent’s belief state containing source material.

  • **kwargs – Additional arguments for processing.

Returns:

Result indicating whether all numbers are valid, with feedback if validation fails.

Return type:

ValidationResult

Example

>>> validator = NumberValidation()
>>> belief = Belief()  # Contains "The price is $50"
>>> result = validator.process_output("It costs $50", belief)
>>> print(result.is_valid)
True
>>> print(result.feedback)
''
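
The core check, that every number in the generated text also appears in the source, can be sketched with a regular expression. This is an illustration only: the number pattern, the exact-string comparison, and the name `numbers_supported` are assumptions, and the real validator works on a Belief object rather than a raw source string.

```python
import re

# Integers and simple decimals; signs, separators, and units are not handled here.
NUMBER = re.compile(r"\d+(?:\.\d+)?")


def numbers_supported(text: str, source: str) -> tuple[bool, list[str]]:
    """Return (is_valid, missing) for the numbers found in text."""
    source_numbers = set(NUMBER.findall(source))
    missing = [number for number in NUMBER.findall(text) if number not in source_numbers]
    return not missing, missing
```

Returning the missing numbers alongside the boolean makes it possible to build a feedback message that names the unsupported values.
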
get_failure_message()[source]#

Get a message describing validation failures.

Returns:

Warning message about potential numerical inaccuracies.

Return type:

str

Example

>>> validator = NumberValidation()
>>> print(validator.get_failure_message())
'The numeric value results might not be fully reliable...'
class sherpa_ai.output_parsers.EntityValidation[source]#

Bases: BaseOutputProcessor

Validator for named entities in text.

This class validates that entities mentioned in generated text can be found in the source material, using progressively more sophisticated similarity comparison methods if initial validation fails.

Example

>>> validator = EntityValidation()
>>> belief = Belief()  # Contains source text about "John Smith"
>>> result = validator.process_output("John Smith is CEO.", belief)
>>> print(result.is_valid)
True
>>> result = validator.process_output("Jane Doe is CEO.", belief)
>>> print(result.is_valid)
False
process_output(text, belief, llm=None, **kwargs)[source]#

Validate entities in text against source material.

This method checks that entities mentioned in the input text can be found in the source material stored in the belief state. It uses increasingly sophisticated comparison methods on validation failures.

Parameters:
  • text (str) – Text containing entities to validate.

  • belief (Belief) – Agent’s belief state containing source material.

  • llm (BaseLanguageModel, optional) – Language model for advanced comparison.

  • **kwargs – Additional arguments for processing.

Returns:

Result indicating whether all entities are valid, with feedback if validation fails.

Return type:

ValidationResult

Example

>>> validator = EntityValidation()
>>> belief = Belief()  # Contains text about "Microsoft"
>>> result = validator.process_output("Microsoft announced...", belief)
>>> print(result.is_valid)
True
>>> print(result.feedback)
''
similarity_picker(value)[source]#

Select text similarity comparison method.

This method determines which similarity comparison method to use based on the number of previous validation attempts.

Parameters:

value (int) – The iteration count used to determine the text similarity method:
  • 0: Use BASIC text similarity.

  • 1: Use text similarity BY_METRICS.

  • Any other value: Use text similarity BY_LLM.

Returns:

Selected comparison method.

Return type:

TextSimilarityMethod

Example

>>> validator = EntityValidation()
>>> method = validator.similarity_picker(0)
>>> print(method)
TextSimilarityMethod.BASIC
>>> method = validator.similarity_picker(2)
>>> print(method)
TextSimilarityMethod.LLM
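
The escalation rule can be sketched as a lookup from attempt count to an enum member. The enum below is a hypothetical stand-in: the member names BASIC and LLM follow the example output above, while METRICS is an assumed name for the metrics-based stage.

```python
from enum import Enum


class SimilarityMethodSketch(Enum):
    """Hypothetical stand-in for TextSimilarityMethod."""
    BASIC = 0
    METRICS = 1  # assumed member name
    LLM = 2


def pick_similarity(attempt: int) -> SimilarityMethodSketch:
    """Use cheaper comparisons first and fall back to the LLM afterwards."""
    if attempt == 0:
        return SimilarityMethodSketch.BASIC
    if attempt == 1:
        return SimilarityMethodSketch.METRICS
    return SimilarityMethodSketch.LLM
```

Ordering the stages from cheapest to most expensive means the language model is consulted only after the simpler checks have already failed.
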
get_failure_message()[source]#

Get a message describing validation failures.

Returns:

Warning message about potential missing entities.

Return type:

str

Example

>>> validator = EntityValidation()
>>> print(validator.get_failure_message())
'Some enitities from the source might not be mentioned.'
check_entities_match(result, source, stage, llm)[source]#

Check if entities in result match those in source.

This method compares entities between result and source text using the specified similarity comparison method.

Parameters:
  • result (str) – Text containing entities to validate.

  • source (str) – Source text to validate against.

  • stage (TextSimilarityMethod) – Comparison method to use.

  • llm (BaseLanguageModel) – Language model for LLM-based comparison.

Returns:

Whether entities match and error message if not.

Return type:

Tuple[bool, str]

Example

>>> validator = EntityValidation()
>>> match, msg = validator.check_entities_match(
...     "Apple released...",
...     "Apple announced...",
...     TextSimilarityMethod.BASIC,
...     None
... )
>>> print(match)
True

sherpa_ai.post_processors module#

Post-processors for outputs from the LLM.

sherpa_ai.post_processors.md_link_to_slack(text)#

Convert Markdown links to Slack links.

Parameters:

text (str) – The input text containing Markdown links.

Returns:

The input text with Markdown links converted to Slack links.

Return type:

str

Example

>>> text = "Check out the [website](https://example.com) for more information."
>>> result = md_link_to_slack(text)
>>> print(result)
Check out the <https://example.com|website> for more information.

sherpa_ai.prompt module#

class sherpa_ai.prompt.SlackBotPrompt(*args, **kwargs)[source]#

Bases: BaseChatPromptTemplate

A chat prompt template for a Slack bot.

This class extends BaseChatPromptTemplate to provide a specialized prompt template for Slack bots. It handles message formatting, token counting, and chat history processing.

ai_name#

The name of the AI assistant.

Type:

str

ai_role#

The role of the AI assistant.

Type:

str

tools#

List of tools available to the AI.

Type:

List[BaseTool]

token_counter#

Function to count tokens in a string.

Type:

Callable[[str], int]

send_token_limit#

Maximum number of tokens to send in a message. Defaults to 4196.

Type:

int

Example

>>> from sherpa_ai.prompt import SlackBotPrompt
>>> prompt = SlackBotPrompt(
...     ai_name="Assistant",
...     ai_role="Helper",
...     tools=[],
...     token_counter=lambda x: len(x.split())
... )
>>> messages = prompt.format_messages(task="Hello", user_input="Hi")
>>> print(len(messages))
3
ai_name: str#
ai_role: str#
tools: List[BaseTool]#
token_counter: Callable[[str], int]#
send_token_limit: int#
construct_base_prompt()[source]#

Construct the base prompt for the AI assistant.

This method generates the foundational system prompt that defines the AI’s identity, role, and available tools. It combines the AI’s name with the prompt generated from available tools.

Returns:

The complete base prompt string.

Return type:

str

Example

>>> from sherpa_ai.prompt import SlackBotPrompt
>>> prompt = SlackBotPrompt(
...     ai_name="Assistant",
...     ai_role="Helper",
...     tools=[],
...     token_counter=lambda x: len(x.split())
... )
>>> base = prompt.construct_base_prompt()
>>> print(base.startswith("You are"))
True
format_messages(**kwargs)[source]#

Format messages for the bot, including system prompts and chat history.

This method constructs a list of messages including:
  • Base system prompt defining the bot’s identity

  • Current time and date

  • Recent chat history (up to token limit)

  • Current task and user input

Parameters:

**kwargs (Any) – Keyword arguments containing:
  • task (str): The current task description

  • messages (List[dict]): Previous chat messages

  • user_input (str): The user’s input message

Returns:

List of formatted messages ready for the model.

Return type:

List[BaseMessage]

Example

>>> prompt = SlackBotPrompt(
...     ai_name="Assistant",
...     ai_role="Helper",
...     tools=[],
...     token_counter=lambda x: len(x.split())
... )
>>> messages = prompt.format_messages(
...     task="Help user",
...     messages=[],
...     user_input="Hello"
... )
>>> print(len(messages))
3
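
Including recent chat history "up to token limit" usually means walking the history from newest to oldest and stopping once the budget is spent. The sketch below shows that idea with plain strings; it is an assumption about the approach rather than the method's actual code, and `recent_history` and its parameters are hypothetical.

```python
from typing import Callable


def recent_history(
    messages: list[str],
    token_counter: Callable[[str], int],
    token_limit: int,
    used_tokens: int = 0,
) -> list[str]:
    """Keep the newest messages that fit in the remaining token budget."""
    kept: list[str] = []
    for message in reversed(messages):
        cost = token_counter(message)
        if used_tokens + cost > token_limit:
            break  # Older messages are dropped once the budget is exhausted.
        kept.insert(0, message)  # Preserve chronological order in the result.
        used_tokens += cost
    return kept
```

The used_tokens argument lets the caller reserve room for the system prompt and the current input before any history is added.
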
process_chat_history(messages)[source]#

Process raw chat history into formatted message objects.

This method converts raw chat messages into appropriate message objects (AIMessage or HumanMessage) and handles message formatting.

Parameters:

messages (List[dict]) – List of raw chat messages, each containing:
  • type (str): Message type (“message” or “text”)

  • user (str): User identifier

  • text (str): Message content

Returns:

List of processed message objects.

Return type:

List[BaseMessage]

Example

>>> prompt = SlackBotPrompt(
...     ai_name="Assistant",
...     ai_role="Helper",
...     tools=[],
...     token_counter=lambda x: len(x.split())
... )
>>> raw_messages = [
...     {"type": "message", "user": "user1", "text": "Hello"},
...     {"type": "message", "user": "assistant", "text": "Hi"}
... ]
>>> processed = prompt.process_chat_history(raw_messages)
>>> print(len(processed))
2
model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'ignore', 'protected_namespaces': ()}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

sherpa_ai.prompt_generator module#

class sherpa_ai.prompt_generator.PromptGenerator[source]#

Bases: object

A class for generating structured prompt strings for AI agents.

This class provides functionality to create well-structured prompts for AI agents by combining constraints, commands, resources, and performance evaluations. It generates prompts in a consistent format that can be easily parsed by the agent.

This class provides methods to:
  • Add constraints to guide agent behavior

  • Add tools/commands the agent can use

  • Add resources available to the agent

  • Add performance evaluation criteria

  • Generate formatted prompt strings

constraints#

List of constraints that guide agent behavior.

Type:

List[str]

commands#

List of tools/commands available to the agent.

Type:

List[BaseTool]

resources#

List of resources available to the agent.

Type:

List[str]

performance_evaluation#

List of performance evaluation criteria.

Type:

List[str]

response_format#

JSON structure defining the expected response format.

Type:

dict

Example

>>> from sherpa_ai.prompt_generator import PromptGenerator
>>> generator = PromptGenerator()
>>> generator.add_constraint("Always be helpful")
>>> generator.add_tool(tool)  # tool is any BaseTool instance
>>> prompt = generator.generate_prompt_string()
>>> print(prompt)
Constraints:
1. Always be helpful
...
add_constraint(constraint)[source]#

Add a constraint to guide the agent’s behavior.

Parameters:

constraint (str) – The constraint to be added to the list.

Return type:

None

Example

>>> generator = PromptGenerator()
>>> generator.add_constraint("Always be helpful")
>>> print(generator.constraints[0])
Always be helpful
add_tool(tool)[source]#

Add a tool/command that the agent can use.

Parameters:

tool (BaseTool) – The tool to be added to the commands list.

Return type:

None

Example

>>> from langchain_core.tools import BaseTool
>>> class MyTool(BaseTool):
...     name: str = "my_tool"
...     description: str = "A tool for testing"
...     def _run(self, query: str) -> str:
...         return "Result"
>>> generator = PromptGenerator()
>>> generator.add_tool(MyTool())
>>> print(len(generator.commands))
1
add_resource(resource)[source]#

Add a resource that is available to the agent.

Parameters:

resource (str) – The resource to be added to the resources list.

Return type:

None

Example

>>> generator = PromptGenerator()
>>> generator.add_resource("Internet access")
>>> print(generator.resources[0])
Internet access
add_performance_evaluation(evaluation)[source]#

Add a performance evaluation criterion for the agent.

Parameters:

evaluation (str) – The evaluation criterion to be added to the list.

Return type:

None

Example

>>> generator = PromptGenerator()
>>> generator.add_performance_evaluation("Be efficient")
>>> print(generator.performance_evaluation[0])
Be efficient
generate_prompt_string()[source]#

Generate a complete prompt string for the agent.

This method combines all constraints, commands, resources, and performance evaluations into a single formatted prompt string with the response format.

Returns:

The complete prompt string for the agent.

Return type:

str

Example

>>> generator = PromptGenerator()
>>> generator.add_constraint("Always be helpful")
>>> prompt = generator.generate_prompt_string()
>>> print(prompt)
Constraints:
1. Always be helpful
...
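The combination step described above can be pictured with a minimal, standard-library-only sketch. It is not the PromptGenerator implementation: the helper names numbered_section and build_prompt are hypothetical, and the "Resources" heading and the blank-line separator are assumptions made for illustration.

```python
from typing import List


def numbered_section(title: str, items: List[str]) -> str:
    """Render a titled, 1-based numbered list (hypothetical helper)."""
    lines = [f"{i}. {item}" for i, item in enumerate(items, start=1)]
    return "\n".join([f"{title}:"] + lines)


def build_prompt(constraints: List[str], resources: List[str]) -> str:
    """Join the sections into one prompt string, separated by blank lines."""
    sections = [
        numbered_section("Constraints", constraints),
        numbered_section("Resources", resources),  # heading is an assumption
    ]
    return "\n\n".join(sections)


prompt = build_prompt(["Always be helpful"], ["Internet access"])
print(prompt)
```

Keeping each section a numbered list gives the agent stable indices to refer back to when it reasons about a constraint or resource.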
sherpa_ai.prompt_generator.get_prompt(tools)[source]#

Generate a complete prompt string with predefined constraints and evaluations.

This function creates a PromptGenerator instance and populates it with standard constraints, tools, resources, and performance evaluations. It then generates and returns a complete prompt string.

Parameters:

tools (List[BaseTool]) – List of tools/commands to include in the prompt.

Returns:

A complete prompt string with all components.

Return type:

str

Example

>>> from langchain_core.tools import BaseTool
>>> from sherpa_ai.prompt_generator import get_prompt
>>> class MyTool(BaseTool):
...     name: str = "my_tool"
...     description: str = "A tool for testing"
...     args = {"query": "string"}
...     def _run(self, query: str) -> str:
...         return "Result"
>>> tools = [MyTool()]
>>> prompt = get_prompt(tools)
>>> print("Constraints:" in prompt)
True
>>> print("Commands:" in prompt)
True

sherpa_ai.reflection module#

class sherpa_ai.reflection.Reflection(llm, tools, action_list=[])[source]#

Bases: object

Class for performing reflection on AI actions.

This class provides functionality to evaluate and refine AI actions based on feedback and previous actions. It includes methods for creating message history, evaluating action outcomes, and updating the AI’s response format.

llm#

The language model to use for evaluation.

Type:

BaseLanguageModel

tools#

The tools available to the AI.

Type:

List[BaseTool]

action_list#

The list of previous actions.

Type:

List

Example

>>> from langchain_core.messages import HumanMessage
>>> from langchain_core.tools import BaseTool
>>> from langchain_openai import ChatOpenAI
>>> from sherpa_ai.reflection import Reflection
>>> llm = ChatOpenAI(model="gpt-4o-mini")
>>> tools = [MyTool()]  # MyTool: any BaseTool subclass
>>> reflection = Reflection(llm, tools)
>>> previous_message = [HumanMessage(content="Hello")]
>>> reflection.evaluate_action("action", "reply", "task", previous_message)
'new_reply'
create_message_history(messages, max_token=2000)[source]#

Create a message history from a list of messages.

This method takes a list of BaseMessage objects and returns a string representing the message history. It reverses the order of the messages and stops adding tokens when the maximum token limit is reached.

Parameters:
  • messages (List[BaseMessage]) – The list of messages to create a history from.

  • max_token (int, optional) – The maximum number of tokens to include in the history. Defaults to 2000.

Returns:

The message history as a string.

Return type:

str

Example

>>> from langchain_core.messages import HumanMessage, AIMessage
>>> messages = [HumanMessage(content="Hello"), AIMessage(content="Hi")]
>>> reflection = Reflection(llm, [])
>>> message_history = reflection.create_message_history(messages)
>>> print(message_history)
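The truncation rule described for create_message_history (walk the messages in reverse and stop once the token budget is reached) can be sketched as follows. This is an illustration under stated assumptions, not the library's code: messages are modeled as (role, content) tuples, count_tokens is a stand-in that counts whitespace-separated words, and the output keeps the newest-first order.

```python
from typing import List, Tuple


def count_tokens(text: str) -> int:
    """Stand-in tokenizer: counts whitespace-separated words (assumption)."""
    return len(text.split())


def build_history(messages: List[Tuple[str, str]], max_token: int = 2000) -> str:
    """Collect messages from newest to oldest until the budget is used up."""
    kept: List[str] = []
    used = 0
    for role, content in reversed(messages):
        line = f"{role}: {content}"
        cost = count_tokens(line)
        if used + cost > max_token:
            break  # the next message would exceed the budget
        kept.append(line)
        used += cost
    return "\n".join(kept)


# With a budget of two tokens only the most recent message fits.
history = build_history([("Human", "Hello"), ("AI", "Hi")], max_token=2)
print(history)
```

Iterating from the newest message means that, when the budget is tight, the oldest context is what gets dropped.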
evaluate_action(action, assistant_reply, task, previous_message)[source]#

Evaluate the action taken by the AI.

This method evaluates the action taken by the AI based on the feedback and previous actions. It updates the AI’s response format if necessary.

Parameters:
  • action (str) – The action taken by the AI.

  • assistant_reply (str) – The reply from the AI.

  • task (str) – The task to solve.

  • previous_message (List[BaseMessage]) – The previous messages.

Returns:

The new reply from the AI.

Return type:

str

Example

>>> from langchain_core.messages import HumanMessage, AIMessage
>>> messages = [HumanMessage(content="Hello"), AIMessage(content="Hi")]
>>> reflection = Reflection(llm, [])
>>> new_reply = reflection.evaluate_action("action", "reply", "task", messages)
>>> print(new_reply)

sherpa_ai.tools module#

sherpa_ai.tools.get_tools(memory, config)[source]#

Factory function to create and configure a set of tools for the agent.

This function creates and returns a list of tools that the agent can use, including search tools and user input handling. The tools are configured based on the provided memory and configuration parameters.

Parameters:
  • memory – The memory component for tools that require memory access.

  • config – Configuration object containing tool settings.

Returns:

A list of configured tool instances.

Return type:

List[BaseTool]

Example

>>> from sherpa_ai.tools import get_tools
>>> tools = get_tools(memory=memory, config=config)
>>> for tool in tools:
...     print(tool.name)
UserInput
Search
class sherpa_ai.tools.SearchArxivTool(**kwargs)[source]#

Bases: BaseTool

Tool for searching and retrieving scientific papers from Arxiv.

This class provides functionality to search Arxiv’s database for scientific papers and retrieve their titles, summaries, and IDs. It’s particularly useful for research-related queries and academic information gathering.

This class inherits from BaseTool and provides methods to:
  • Search Arxiv’s database

  • Parse and format search results

  • Return paper metadata and summaries

name#

The name of the tool, set to “Arxiv Search”.

Type:

str

description#

A description of when to use this tool.

Type:

str

Example

>>> from sherpa_ai.tools import SearchArxivTool
>>> tool = SearchArxivTool()
>>> result = tool._run("machine learning")
>>> print(result)
Title: Example Paper
Summary: This paper discusses...
name: str#

The unique name of the tool that clearly communicates its purpose.

description: str#

Used to tell the model how/when/why to use the tool.

You can provide few-shot examples as a part of the description.

model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'ignore', 'protected_namespaces': ()}#

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

class sherpa_ai.tools.SearchTool(**kwargs)[source]#

Bases: BaseTool

Tool for performing internet searches using the Google Serper API.

This class provides functionality to search the internet for information using the Google Serper API. It supports domain-specific searches and can return both formatted results and resources for citation.

This class inherits from BaseTool and provides methods to:
  • Perform general internet searches

  • Execute domain-specific searches

  • Parse and format search results

  • Handle knowledge graph and answer box results

name#

The name of the tool, set to “Search”.

Type:

str

config#

Configuration object for search settings.

Type:

AgentConfig

top_k#

Number of results to return, defaults to 10.

Type:

int

description#

A description of when to use this tool.

Type:

str

Example

>>> from sherpa_ai.tools import SearchTool
>>> tool = SearchTool(config=config)
>>> result = tool._run("python programming")
>>> print(result)
Answer: Python is a high-level programming language...
name: str#

The unique name of the tool that clearly communicates its purpose.

config: AgentConfig#
top_k: int#
description: str#

Used to tell the model how/when/why to use the tool.

You can provide few-shot examples as a part of the description.

formulate_site_search(query, site)[source]#

Formulate a site-specific search query.

Parameters:
  • query (str) – The base search query.

  • site (str) – The site to restrict the search to.

Returns:

The formatted site-specific search query.

Return type:

str

Example

>>> tool = SearchTool()
>>> query = tool.formulate_site_search("python", "python.org")
>>> print(query)
python site:python.org
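Judging from the example output above, the site-specific query is the base query followed by a site: operator. A minimal sketch of that formatting, using a hypothetical standalone function site_query rather than the SearchTool method itself:

```python
def site_query(query: str, site: str) -> str:
    """Append a site: operator so results are limited to one domain."""
    return f"{query.strip()} site:{site.strip()}"


restricted = site_query("python", "python.org")
print(restricted)  # python site:python.org
```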
model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'ignore', 'protected_namespaces': ()}#

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

class sherpa_ai.tools.ContextTool(**kwargs)[source]#

Bases: BaseTool

Tool for accessing internal technical documentation.

This class provides functionality to search and retrieve information from internal technical documentation for various AI-related projects. It uses a vector store retriever to find relevant documentation based on queries.

This class inherits from BaseTool and provides methods to:
  • Search internal documentation

  • Retrieve relevant context

  • Format search results

name#

The name of the tool, set to “Context”.

Type:

str

description#

A description of when to use this tool.

Type:

str

memory#

The vector store retriever for searching documentation.

Type:

VectorStoreRetriever

Example

>>> from sherpa_ai.tools import ContextTool
>>> tool = ContextTool(memory=memory)
>>> result = tool._run("How to use LangChain?")
>>> print(result)
LangChain is a framework for developing applications...
name: str#

The unique name of the tool that clearly communicates its purpose.

description: str#

Used to tell the model how/when/why to use the tool.

You can provide few-shot examples as a part of the description.

memory: VectorStoreRetriever#
model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'ignore', 'protected_namespaces': ()}#

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

class sherpa_ai.tools.UserInputTool(**kwargs)[source]#

Bases: BaseTool

Tool for handling user input in the agent system.

This class provides functionality to process and handle user input within the agent system. It serves as an interface for receiving and processing user queries.

This class inherits from BaseTool and provides methods to:
  • Process user input

  • Return user queries

name#

The name of the tool, set to “UserInput”.

Type:

str

description#

A description of when to use this tool.

Type:

str

Example

>>> from sherpa_ai.tools import UserInputTool
>>> tool = UserInputTool()
>>> result = tool._run("What is Python?")
>>> print(result)
What is Python?
name: str#

The unique name of the tool that clearly communicates its purpose.

description: str#

Used to tell the model how/when/why to use the tool.

You can provide few-shot examples as a part of the description.

model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'ignore', 'protected_namespaces': ()}#

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

class sherpa_ai.tools.LinkScraperTool(**kwargs)[source]#

Bases: BaseTool

Tool for extracting content from web links.

This class provides functionality to scrape and extract content from web links. It’s useful for retrieving information from specific URLs when needed.

This class inherits from BaseTool and provides methods to:
  • Scrape web content

  • Extract information from links

  • Process and format scraped content

name#

The name of the tool, set to “Link Scraper”.

Type:

str

description#

A description of when to use this tool.

Type:

str

Example

>>> from sherpa_ai.tools import LinkScraperTool
>>> tool = LinkScraperTool()
>>> result = tool._run("https://example.com", llm=llm)
>>> print(result)
Content from the webpage...
name: str#

The unique name of the tool that clearly communicates its purpose.

description: str#

Used to tell the model how/when/why to use the tool.

You can provide few-shot examples as a part of the description.

model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'ignore', 'protected_namespaces': ()}#

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

sherpa_ai.utils module#

sherpa_ai.utils.load_files(files)[source]#

Load files from a list of file paths.

Parameters:

files (List[str]) – A list of file paths to load.

Returns:

A list of loaded documents.

Return type:

List[Document]

Example

>>> from langchain_core.documents import Document
>>> from sherpa_ai.utils import load_files
>>> files = ["file1.pdf", "file2.md"]
>>> documents = load_files(files)
>>> print(len(documents))
2

sherpa_ai.utils.get_links_from_string(text)[source]#

Extract links from a string.

Parameters:

text (str) – The input string to extract links from.

Returns:

A list of dictionaries containing the extracted links and their base URLs.

Return type:

List[dict]

Example

>>> from sherpa_ai.utils import get_links_from_string
>>> text = "Check out this link: <https://www.example.com>"
>>> links = get_links_from_string(text)
>>> print(links)
[{'url': 'https://www.example.com', 'base_url': 'https://www.example.com'}]
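One way to produce the url/base_url pairs shown above is a regular expression for the links plus urllib.parse for the base URL. The sketch below is an assumption about the approach, not the library's implementation; URL_PATTERN, base_url_of, and links_in are hypothetical names, and the base URL is taken to be the scheme plus the host.

```python
import re
from typing import Dict, List
from urllib.parse import urlparse

# Matches http(s) URLs and stops at whitespace, angle brackets, or quotes.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def base_url_of(link: str) -> str:
    """Return scheme and host of a link, dropping path and query."""
    parts = urlparse(link)
    return f"{parts.scheme}://{parts.netloc}"


def links_in(text: str) -> List[Dict[str, str]]:
    """Find every URL in the text and pair it with its base URL."""
    return [
        {"url": url, "base_url": base_url_of(url)}
        for url in URL_PATTERN.findall(text)
    ]


links = links_in("Check out this link: <https://www.example.com/docs>")
print(links)
```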
sherpa_ai.utils.get_base_url(link)[source]#

Get the base URL from a link.

Parameters:

link (str) – The input link.

Returns:

The base URL of the link.

Return type:

str

Example

>>> from sherpa_ai.utils import get_base_url
>>> link = "https://www.example.com/path/to/resource"
>>> base_url = get_base_url(link)
>>> print(base_url)
https://www.example.com

sherpa_ai.utils.get_link_from_slack_client_conversation(data)[source]#

Get links from a Slack client conversation.

Parameters:

data (list) – The input data containing the conversation.

Returns:

A list of dictionaries containing the extracted links and their base URLs.

Return type:

list

Example

>>> from sherpa_ai.utils import get_link_from_slack_client_conversation
>>> data = [{"blocks": [{"elements": [{"elements": [{"type": "link", "url": "https://www.example.com"}]}]}]}]
>>> links = get_link_from_slack_client_conversation(data)
>>> print(links)
[{'url': 'https://www.example.com', 'base_url': 'https://www.example.com'}]
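The nested structure in the example (messages, then blocks, then two levels of elements) suggests a straightforward traversal. The following is a sketch that assumes exactly that shape; slack_links is a hypothetical name and the real function may handle more element types.

```python
from typing import Any, Dict, List
from urllib.parse import urlparse


def slack_links(data: List[Dict[str, Any]]) -> List[Dict[str, str]]:
    """Collect every element of type "link" from Slack message blocks."""
    links: List[Dict[str, str]] = []
    for message in data:
        for block in message.get("blocks", []):
            for section in block.get("elements", []):
                for element in section.get("elements", []):
                    if element.get("type") != "link":
                        continue  # skip plain text, mentions, and so on
                    url = element["url"]
                    parts = urlparse(url)
                    links.append(
                        {"url": url, "base_url": f"{parts.scheme}://{parts.netloc}"}
                    )
    return links


conversation = [
    {"blocks": [{"elements": [{"elements": [
        {"type": "text", "text": "see "},
        {"type": "link", "url": "https://www.example.com"},
    ]}]}]}
]
found = slack_links(conversation)
print(found)
```

Using .get with an empty-list default lets messages without blocks pass through without raising.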
sherpa_ai.utils.scrape_with_url(url)[source]#

Scrape a URL and return the text.

Parameters:

url (str) – The URL to scrape.

Returns:

A dictionary containing the scraped text and the status code.

Return type:

dict

Example

>>> from sherpa_ai.utils import scrape_with_url
>>> url = "https://www.example.com"
>>> result = scrape_with_url(url)
>>> print(result)
{'data': 'Example text', 'status': 200}

sherpa_ai.utils.rewrite_link_references(data, question)[source]#

Rewrite link references in a text.

Parameters:
  • data (any) – The input data containing the links.

  • question (str) – The question to rewrite the links for.

Returns:

The rewritten text with link references.

Return type:

str

Example

>>> from sherpa_ai.utils import rewrite_link_references
>>> data = [{"link": "https://www.example.com"}]
>>> question = "What is the link?"
>>> result = rewrite_link_references(data, question)
>>> print(result)
What is the link? [1] link: https://www.example.com
sherpa_ai.utils.count_string_tokens(string, model_name)[source]#

Returns the number of tokens in a text string.

Parameters:
  • string (str) – The text string.

  • model_name (str) – The name of the model whose token encoding is used (e.g., “gpt-3.5-turbo”).

Returns:

The number of tokens in the text string.

Return type:

int

sherpa_ai.utils.chunk_and_summarize(text_data, question, link, llm)[source]#

Chunk and summarize text.

Parameters:
  • text_data (str) – The text to chunk and summarize.

  • question (str) – The question to answer.

  • link (str) – The link to the text.

  • llm (BaseLanguageModel) – The language model to use for summarization.

Returns:

The summarized text.

Return type:

str

Example

>>> from langchain_openai import ChatOpenAI
>>> from sherpa_ai.utils import chunk_and_summarize
>>> text_data = "This is a test text."
>>> question = "What is the text?"
>>> link = "https://www.example.com"
>>> llm = ChatOpenAI(model="gpt-3.5-turbo")
>>> result = chunk_and_summarize(text_data, question, link, llm)
>>> print(result)
This is a test text.
sherpa_ai.utils.chunk_and_summarize_file(text_data, question, file_name, file_format, llm, title=None)[source]#

Chunk and summarize a file.

Parameters:
  • text_data (str) – The text to chunk and summarize.

  • question (str) – The question to answer.

  • file_name (str) – The name of the file.

  • file_format (str) – The format of the file.

  • llm (BaseLanguageModel) – The language model to use for summarization.

  • title (str) – The title of the file.

Returns:

The summarized text.

Return type:

str

Example

>>> from langchain_openai import ChatOpenAI
>>> from sherpa_ai.utils import chunk_and_summarize_file
>>> text_data = "This is a test text."
>>> question = "What is the text?"
>>> file_name = "test.txt"
>>> file_format = "txt"
>>> llm = ChatOpenAI(model="gpt-3.5-turbo")
>>> result = chunk_and_summarize_file(text_data, question, file_name, file_format, llm)
>>> print(result)
This is a test text.
sherpa_ai.utils.question_with_file_reconstructor(data, file_name, title, file_format, question)[source]#

Reconstruct a question with a file reference.

Parameters:
  • data (str) – The data to reconstruct the question with.

  • file_name (str) – The name of the file.

  • title (str) – The title of the file.

  • file_format (str) – The format of the file.

  • question (str) – The question to reconstruct.

Returns:

The reconstructed question.

Return type:

str

Example

>>> from sherpa_ai.utils import question_with_file_reconstructor
>>> data = "This is a test text."
>>> file_name = "test.txt"
>>> title = "Test Title"
>>> file_format = "txt"
>>> question = "What is the text?"
>>> result = question_with_file_reconstructor(data, file_name, title, file_format, question)
>>> print(result)
"What is the text? [1] link: https://www.example.com"
sherpa_ai.utils.log_formatter(logs)[source]#

Format logs for verbose output.

Parameters:

logs (list) – The logs to format.

Returns:

The formatted logs.

Return type:

str

sherpa_ai.utils.show_commands_only(logs)[source]#

Modified version of log_formatter that only shows commands.

Parameters:

logs (list) – The logs to format.

Returns:

The formatted logs.

Return type:

str

sherpa_ai.utils.extract_text_from_pdf(pdf_path)[source]#
sherpa_ai.utils.extract_urls(text)[source]#

Extract URLs from a text.

Parameters:

text (str) – The text to extract URLs from.

Returns:

A list of URLs.

Return type:

list

sherpa_ai.utils.check_url(url)[source]#

Check if a URL is valid.

Parameters:

url (str) – The URL to check.

Returns:

True if the URL is valid, False otherwise.

Return type:

bool

sherpa_ai.utils.extract_numbers_from_text(text)[source]#

Extract numbers from a text.

Parameters:

text (str) – The text to extract numbers from.

Returns:

A list of numbers.

Return type:

list

sherpa_ai.utils.word_to_float(text)[source]#

Convert a word to a float.

Parameters:

text (str) – The text to convert to a float.

Returns:

A dictionary containing the success and data.

  • ‘success’ (bool): True if the conversion was successful, False otherwise.

  • ‘data’ (float): The converted float value if ‘success’ is True.

  • ‘message’ (str): An error message if ‘success’ is False.

Return type:

dict

Example

>>> from sherpa_ai.utils import word_to_float
>>> text = "one"
>>> result = word_to_float(text)
>>> print(result)
{'success': True, 'data': 1}
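The success/data/message result shape can be illustrated with a small stand-in. The sketch below is not the library's conversion logic: WORD_VALUES is a deliberately tiny, hypothetical lookup table, and word_to_float_sketch is a hypothetical name. It only shows how a caller can branch on the returned dictionary.

```python
from typing import Any, Dict

# Tiny illustrative vocabulary; a real converter would cover far more words.
WORD_VALUES = {"zero": 0.0, "one": 1.0, "two": 2.0, "three": 3.0}


def word_to_float_sketch(text: str) -> Dict[str, Any]:
    """Convert a number word or numeric string to a float result dict."""
    key = text.strip().lower()
    if key in WORD_VALUES:
        return {"success": True, "data": WORD_VALUES[key]}
    try:
        return {"success": True, "data": float(key)}
    except ValueError:
        return {"success": False, "message": f"Unable to convert {text!r} to a number."}


result = word_to_float_sketch("one")
if result["success"]:
    print(result["data"])
else:
    print(result["message"])
```

Returning a dictionary instead of raising lets callers handle unparseable text without a try/except at every call site.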
sherpa_ai.utils.extract_numeric_entities(text, entity_types=['DATE', 'CARDINAL', 'QUANTITY', 'MONEY'])[source]#

Extract numeric entities from a text.

Parameters:
  • text (str) – The text to extract numeric entities from.

  • entity_types (List[str]) – A list of spaCy entity types to consider for extraction.

Returns:

A list of numeric entities.

Return type:

list

sherpa_ai.utils.combined_number_extractor(text)[source]#

Extract unique numeric values from a text by combining results from two different extraction methods.

Parameters:

text (str) – The text to extract numeric values from.

Returns:

A list of unique numeric values.

Return type:

list

sherpa_ai.utils.verify_numbers_against_source(text_to_test, source_text)[source]#

Verify that all numbers in text_to_test exist in source_text.

Parameters:
  • text_to_test (Optional[str]) – The text to test.

  • source_text (Optional[str]) – The source text.

Returns:

A tuple containing a boolean and a message.
  • boolean: True if all numbers in text_to_test exist in source_text, False otherwise.

  • message: A message indicating whether the numbers in text_to_test exist in source_text.

Return type:

tuple

sherpa_ai.utils.check_if_number_exist(result, source)[source]#

Check if a number exists in a text.

Parameters:
  • result (str) – The text to check.

  • source (str) – The source text.

Returns:

A tuple containing a boolean and a message.
  • boolean: True if the number exists in the source text, False otherwise.

  • message: A message indicating whether the number exists in the source text.

Return type:

tuple

sherpa_ai.utils.string_comparison_with_jaccard_and_levenshtein(word1, word2, levenshtein_constant)[source]#

Calculate a combined similarity metric using Jaccard similarity and normalized Levenshtein distance.

Parameters:
  • word1 (str) – First input string.

  • word2 (str) – Second input string.

  • levenshtein_constant (float) – Weight for the Levenshtein distance in the combined metric.

Returns:

Combined similarity metric.

Return type:

float
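A combined metric of this kind can be sketched as a weighted sum of two similarity scores. The exact formula used by the library is not stated here, so the weighting below is an assumption: Jaccard similarity is computed over character sets, the Levenshtein distance is normalized by the longer string and turned into a similarity, and levenshtein_constant is treated as a weight between 0 and 1.

```python
def jaccard(a: str, b: str) -> float:
    """Jaccard similarity over the sets of characters (assumed granularity)."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


def levenshtein(a: str, b: str) -> int:
    """Edit distance computed row by row with dynamic programming."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost)
            )
        previous = current
    return previous[-1]


def combined_similarity(word1: str, word2: str, levenshtein_constant: float) -> float:
    """Weighted sum of Levenshtein similarity and Jaccard similarity."""
    longest = max(len(word1), len(word2)) or 1
    lev_similarity = 1.0 - levenshtein(word1, word2) / longest
    return (
        levenshtein_constant * lev_similarity
        + (1 - levenshtein_constant) * jaccard(word1, word2)
    )


print(combined_similarity("kitten", "sitting", 0.5))
```

Blending the two scores makes the metric sensitive both to shared characters and to their order, which neither measure captures alone.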

sherpa_ai.utils.extract_entities(text)[source]#

Extract entities of the following types using spaCy: NORP (nationalities or religious or political groups), ORG (organizations), GPE (geopolitical entities), and LOC (locations).

Parameters:

text (str) – The text to extract entities from.

Returns:

A list of extracted entities.

Return type:

list

sherpa_ai.utils.json_from_text(text)[source]#

Extract and parse JSON data from a text.

Parameters:

text (str) – Input text containing JSON data.

Returns:

Parsed JSON data.

Return type:

dict
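Language-model replies often wrap JSON in surrounding prose, so extraction has to locate the object before parsing it. One possible approach, shown here as a sketch with the hypothetical name first_json_object, scans for an opening brace and lets json.JSONDecoder.raw_decode parse from that position; returning an empty dict on failure is an assumption.

```python
import json
from typing import Any, Dict


def first_json_object(text: str) -> Dict[str, Any]:
    """Return the first parseable JSON object embedded in the text."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            value, _end = decoder.raw_decode(text, start)
            if isinstance(value, dict):
                return value
        except json.JSONDecodeError:
            pass  # not valid JSON at this brace; try the next one
        start = text.find("{", start + 1)
    return {}  # assumption: signal "nothing found" with an empty dict


parsed = first_json_object('Answer: {"command": {"name": "Search"}} done')
print(parsed)
```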

sherpa_ai.utils.text_similarity_by_llm(llm, source_entity, source, result, user_id=None, team_id=None)[source]#

Check if entities from a question are mentioned in some form inside the answer using a language model.

Parameters:
  • llm – The language model used to perform the check.

  • source_entity (List[str]) – List of entities from the question.

  • source (str) – Question text.

  • result (str) – Answer text.

  • user_id (str) – User ID (optional).

  • team_id (str) – Team ID (optional).

Returns:

Result of the check containing ‘entity_exist’ and ‘messages’.

Return type:

dict

sherpa_ai.utils.text_similarity_by_metrics(check_entity, source_entity)[source]#

Check entity similarity based on Jaccard and Levenshtein metrics.

Parameters:
  • check_entity (List[str]) – List of entities to check.

  • source_entity (List[str]) – List of reference entities.

Returns:

Result of the check containing ‘entity_exist’ and ‘messages’.

Return type:

dict

sherpa_ai.utils.text_similarity(check_entity, source_entity)[source]#

Check if entities from a reference list are present in another list.

Parameters:
  • check_entity (List[str]) – List of entities to check.

  • source_entity (List[str]) – List of reference entities.

Returns:

Result of the check containing ‘entity_exist’ and ‘messages’.

Return type:

dict
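The presence check can be pictured as a membership test over the two lists. The sketch below assumes case-insensitive exact matching and a free-form message string; entities_present is a hypothetical name, and only the ‘entity_exist’ and ‘messages’ keys come from the description above.

```python
from typing import Any, Dict, List


def entities_present(check_entity: List[str], source_entity: List[str]) -> Dict[str, Any]:
    """Check that every reference entity appears in the list being checked."""
    available = {entity.lower() for entity in check_entity}
    missing = [entity for entity in source_entity if entity.lower() not in available]
    if missing:
        return {
            "entity_exist": False,
            "messages": "Missing entities: " + ", ".join(missing),
        }
    return {"entity_exist": True, "messages": ""}


report = entities_present(["Paris"], ["Paris", "Berlin"])
print(report)
```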

sherpa_ai.utils.file_text_splitter(data, meta_data)[source]#

Split a text into chunks of a given size.

Parameters:
  • data (str) – The text to split.

  • meta_data (dict) – The metadata to include in the chunks.

Returns:

A dictionary containing the texts and meta_datas.

Return type:

dict
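Splitting text while carrying metadata along can be sketched as below. This is an illustration only: the fixed-width character chunks, the chunk_size parameter, and the function name split_with_metadata are assumptions, while the "texts" and "meta_datas" keys follow the return description above.

```python
from typing import Any, Dict


def split_with_metadata(
    data: str, meta_data: Dict[str, Any], chunk_size: int = 20
) -> Dict[str, list]:
    """Cut the text into fixed-size chunks and copy the metadata per chunk."""
    texts = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    # Each chunk gets its own copy so later edits do not affect the others.
    meta_datas = [dict(meta_data) for _ in texts]
    return {"texts": texts, "meta_datas": meta_datas}


chunks = split_with_metadata("abcdefghij", {"file": "a.txt"}, chunk_size=4)
print(chunks["texts"])
```

Keeping the two lists aligned by index lets each chunk be stored together with the metadata that identifies its source file.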

sherpa_ai.utils.is_coroutine_function(func)[source]#

Check if a function is a coroutine function.

Parameters:

func (Any) – The function to check.

Returns:

True if the function is a coroutine function, False otherwise.

Return type:

bool
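A check like this is typically used to decide whether a callable must be awaited. The sketch below uses the standard library's inspect.iscoroutinefunction directly, which is an assumption about how such a check can be done rather than a statement about this function's internals; fetch, compute, and call are hypothetical names.

```python
import asyncio
import inspect
from typing import Any, Callable


async def fetch() -> str:
    """An async callable, standing in for an asynchronous action."""
    return "done"


def compute() -> str:
    """A regular callable, standing in for a synchronous action."""
    return "done"


def call(func: Callable[[], Any]) -> Any:
    """Run the callable, driving an event loop only when it is async."""
    if inspect.iscoroutinefunction(func):
        return asyncio.run(func())
    return func()


print(call(fetch), call(compute))
```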

Module contents#