sherpa_ai.database package

Overview#

The database package provides database interaction capabilities for Sherpa AI, enabling persistence, tracking, and analysis of agent interactions and user data.

Key Components

  • UserUsageTracker: Tracks and records user interactions with Sherpa AI

  • Storage Management: Tools for storing and retrieving data

  • Usage Analytics: Capabilities for analyzing usage patterns

Example Usage#

from sherpa_ai.database.user_usage_tracker import UserUsageTracker

# Initialize a usage tracker (defaults to a local SQLite database, token_counter.db)
tracker = UserUsageTracker()

# Record token usage for a user
usage = tracker.add_usage(
    user_id="user123",
    input_tokens=100,
    output_tokens=250,
    model_name="gpt-4",
)

# Get the total cost recorded for this user
total_cost = tracker.get_user_cost("user123")
print(f"Total cost for user123: {total_cost}")

Submodules#

Module                                   Description
sherpa_ai.database.user_usage_tracker    Implements tracking and analytics for user interactions with the system.

sherpa_ai.database.user_usage_tracker module#

User usage tracking module for Sherpa AI.

class sherpa_ai.database.user_usage_tracker.UsageTracker(**kwargs)[source]#

Bases: Base

SQLAlchemy model for tracking LLM token usage.

id#
user_id#
cost#
model_name#
session_id#
agent_name#
timestamp#
reset_timestamp#
reminded_timestamp#
usage_metadata_json#
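
To make the column list above concrete, here is a minimal sketch of a comparable table built with Python's standard sqlite3 module rather than SQLAlchemy. Only the column names come from the attribute list; the table name usage_tracker, the column types, and the sample row are assumptions for illustration and may differ from the actual model.

```python
import json
import sqlite3
import time

# Assumed schema: column names follow the documented attributes,
# but the table name and column types are guesses for illustration.
SCHEMA = """
CREATE TABLE usage_tracker (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    user_id TEXT,
    cost REAL,
    model_name TEXT,
    session_id TEXT,
    agent_name TEXT,
    timestamp INTEGER,
    reset_timestamp INTEGER,
    reminded_timestamp INTEGER,
    usage_metadata_json TEXT
)
"""

conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)

# Store one usage record; the metadata dict is serialized to a JSON string.
metadata = {"input_tokens": 100, "output_tokens": 250}
conn.execute(
    "INSERT INTO usage_tracker "
    "(user_id, cost, model_name, session_id, agent_name, timestamp, "
    "reset_timestamp, reminded_timestamp, usage_metadata_json) "
    "VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
    ("user123", 0.018, "gpt-4", "session-1", "qa_agent",
     int(time.time()), 0, 0, json.dumps(metadata)),
)
conn.commit()

# Read the record back and decode the stored metadata.
row = conn.execute(
    "SELECT user_id, model_name, usage_metadata_json FROM usage_tracker"
).fetchone()
stored_metadata = json.loads(row[2])
```

Keeping the metadata as a JSON string in a single column lets the record hold provider-specific fields without schema changes, at the cost of parsing it on every read.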
class sherpa_ai.database.user_usage_tracker.Whitelist(**kwargs)[source]#

Bases: Base

SQLAlchemy model for user whitelist.

id#
user_id#
class sherpa_ai.database.user_usage_tracker.UserUsageTracker(db_name='token_counter.db', db_url='sqlite:///token_counter.db', bucket_name=None, s3_file_key=None, log_to_s3=None, log_to_file=None, log_file_path=None, pricing_manager=None, engine=None, session=None, verbose_logger=None)[source]#

Bases: object

Clean, minimal usage tracker with essential functionality only.

add_usage(user_id, input_tokens=0, output_tokens=0, usage_metadata=None, model_name=None, session_id=None, agent_name=None, cost=None, check_limits=True, send_reminder=True, reset_timestamp=False, reminded_timestamp=False)[source]#

Unified method to add usage data with optional limit checking.

Parameters:
  • user_id (str) – ID of the user.

  • input_tokens (int) – Number of input tokens (used if no usage_metadata).

  • output_tokens (int) – Number of output tokens (used if no usage_metadata).

  • usage_metadata (Optional[Dict[str, Any]]) – Usage metadata from callback.

  • model_name (Optional[str]) – Name of the model used.

  • session_id (Optional[str]) – ID of the session.

  • agent_name (Optional[str]) – Name of the agent.

  • cost (Optional[float]) – Cost in USD. If None, the cost is calculated.

  • check_limits (bool) – Whether to check if user has exceeded limits.

  • send_reminder (bool) – Whether to send reminder if approaching limits.

  • reset_timestamp (bool) – Whether to reset the timestamp.

  • reminded_timestamp (bool) – Whether to mark as reminded.

Returns:

Usage information if check_limits=True, None otherwise.

Return type:

Optional[dict]
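
The parameter list says input_tokens and output_tokens are used only when no usage_metadata is given. The sketch below shows one way that precedence could work. The helper name resolve_token_counts and the metadata keys "input_tokens" and "output_tokens" are assumptions, not the library's implementation.

```python
from typing import Any, Dict, Optional, Tuple


def resolve_token_counts(
    input_tokens: int = 0,
    output_tokens: int = 0,
    usage_metadata: Optional[Dict[str, Any]] = None,
) -> Tuple[int, int]:
    """Return (input, output) token counts, preferring usage_metadata."""
    if usage_metadata:
        # Assumed key names; the real metadata layout may differ.
        return (
            int(usage_metadata.get("input_tokens", 0)),
            int(usage_metadata.get("output_tokens", 0)),
        )
    # No metadata supplied: fall back to the explicit arguments.
    return input_tokens, output_tokens


explicit = resolve_token_counts(input_tokens=100, output_tokens=250)
from_metadata = resolve_token_counts(
    input_tokens=1,
    output_tokens=1,
    usage_metadata={"input_tokens": 120, "output_tokens": 300},
)
```

In this sketch the explicit counts are ignored whenever metadata is present, which matches the "used if no usage_metadata" wording above.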

add_to_whitelist(user_id)[source]#

Add a user to the whitelist.

get_all_whitelisted_ids()[source]#

Get all whitelisted user IDs (backward compatibility).

Return type:

List[str]

get_whitelist_by_user_id(user_id)[source]#

Get whitelist information for a specific user (backward compatibility).

Return type:

List[Dict[str, Any]]

get_all_data()[source]#

Get all usage data.

Return type:

List[Dict[str, Any]]

parse_usage_metadata(usage_metadata_json)[source]#

Parse usage metadata JSON string into structured data.

Parameters:

usage_metadata_json (str) – JSON string of usage metadata.

Returns:

Parsed usage metadata or empty dict if parsing fails.

Return type:

Dict[str, Any]
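
The documented contract, a parsed dict on success and an empty dict on failure, can be sketched with the standard json module. The function name parse_usage_metadata_sketch is hypothetical, and treating non-object JSON (such as a list) as a failure is an assumption.

```python
import json
from typing import Any, Dict


def parse_usage_metadata_sketch(usage_metadata_json: str) -> Dict[str, Any]:
    """Parse a JSON string, returning {} on any parsing problem."""
    try:
        parsed = json.loads(usage_metadata_json)
    except (TypeError, ValueError):
        # json.JSONDecodeError is a ValueError; TypeError covers None input.
        return {}
    # Treat valid JSON that is not an object (e.g. a list) as a failure too.
    return parsed if isinstance(parsed, dict) else {}


good = parse_usage_metadata_sketch('{"input_tokens": 100, "output_tokens": 250}')
bad = parse_usage_metadata_sketch("not json")
```

Returning an empty dict instead of raising lets callers use .get() on the result without their own error handling.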

close_connection()[source]#

Close the database connection and cleanup helpers.

get_tokens_from_usage_metadata(usage_metadata)[source]#

Extract token counts from usage metadata.

Return type:

Dict[str, int]

get_token_details_from_usage_metadata(usage_metadata)[source]#

Extract detailed token information from usage metadata.

Return type:

Dict[str, Any]
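
This page does not describe the layout of the metadata dictionary. The sketch below assumes a shape with top-level input_tokens and output_tokens plus optional nested input_token_details and output_token_details. All key names, and the function token_details_sketch itself, are assumptions for illustration.

```python
from typing import Any, Dict


def token_details_sketch(usage_metadata: Dict[str, Any]) -> Dict[str, Any]:
    """Flatten an assumed usage-metadata layout into a summary dict."""
    # Missing or None detail blocks are treated as empty.
    input_details = usage_metadata.get("input_token_details") or {}
    output_details = usage_metadata.get("output_token_details") or {}
    input_tokens = int(usage_metadata.get("input_tokens", 0))
    output_tokens = int(usage_metadata.get("output_tokens", 0))
    return {
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        # Fall back to the sum when no explicit total is reported.
        "total_tokens": int(
            usage_metadata.get("total_tokens", input_tokens + output_tokens)
        ),
        "cached_input_tokens": int(input_details.get("cache_read", 0)),
        "reasoning_tokens": int(output_details.get("reasoning", 0)),
    }


details = token_details_sketch(
    {
        "input_tokens": 120,
        "output_tokens": 300,
        "input_token_details": {"cache_read": 40},
        "output_token_details": {"reasoning": 64},
    }
)
```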

upload_to_s3()[source]#

Upload database to S3.

classmethod download_from_s3(db_name=None, db_url=None, **kwargs)[source]#

Download database from S3 and return UserUsageTracker instance.

download_from_s3_instance()[source]#

Download database from S3 using instance configuration.

get_user_cost(user_id)[source]#

Get total cost for a user.

Return type:

float

get_session_cost(session_id)[source]#

Get total cost for a session.

Return type:

float

get_agent_cost(agent_name)[source]#

Get total cost for an agent.

Return type:

float
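
The three cost getters above differ only in which column they filter on. Here is a minimal sketch of that idea using the standard sqlite3 module. The table name, sample rows, and the helper total_cost are assumptions, and the real class works through SQLAlchemy rather than raw SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE usage_tracker "
    "(user_id TEXT, session_id TEXT, agent_name TEXT, cost REAL)"
)
conn.executemany(
    "INSERT INTO usage_tracker VALUES (?, ?, ?, ?)",
    [
        ("user123", "s1", "qa_agent", 0.25),
        ("user123", "s2", "planner", 0.5),
        ("user456", "s3", "qa_agent", 1.0),
    ],
)

# Only these column names may be interpolated into the SQL text.
ALLOWED_COLUMNS = {"user_id", "session_id", "agent_name"}


def total_cost(column: str, value: str) -> float:
    """Sum the cost column for rows where `column` equals `value`."""
    if column not in ALLOWED_COLUMNS:
        raise ValueError(f"unsupported column: {column}")
    row = conn.execute(
        f"SELECT COALESCE(SUM(cost), 0.0) FROM usage_tracker WHERE {column} = ?",
        (value,),
    ).fetchone()
    return float(row[0])


user_cost = total_cost("user_id", "user123")
session_cost = total_cost("session_id", "s3")
agent_cost = total_cost("agent_name", "qa_agent")
unknown_cost = total_cost("user_id", "nobody")
```

COALESCE makes the query return 0.0 for a user, session, or agent with no records, so the helper always returns a float.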

get_cost_summary(user_id=None)[source]#

Get cost summary statistics.

Return type:

dict

estimate_cost(model_name, input_tokens, output_tokens)[source]#

Estimate cost for a model call.

Return type:

float
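
A cost estimate of this kind usually multiplies token counts by per-token prices. The sketch below assumes separate input and output prices per 1,000 tokens. The pricing table, the model name "example-model", and its prices are invented for illustration and say nothing about how the configured pricing_manager behaves.

```python
from typing import Dict, Tuple

# Hypothetical prices in USD per 1,000 tokens: (input, output).
PRICES_PER_1K: Dict[str, Tuple[float, float]] = {
    "example-model": (0.03, 0.06),
}


def estimate_cost_sketch(
    model_name: str, input_tokens: int, output_tokens: int
) -> float:
    """Estimate the USD cost of one call from token counts."""
    if model_name not in PRICES_PER_1K:
        # Unknown model: this sketch reports zero rather than guessing a price.
        return 0.0
    input_price, output_price = PRICES_PER_1K[model_name]
    return (input_tokens / 1000) * input_price + (output_tokens / 1000) * output_price


cost = estimate_cost_sketch("example-model", input_tokens=1000, output_tokens=500)
```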

check_cost_limit(user_id, limit)[source]#

Check if user has exceeded cost limit.

Return type:

dict
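
The keys of the returned dict are not listed on this page. The sketch below shows one plausible shape for a cost-limit check. The function name and the keys "total_cost", "limit", "remaining", and "exceeded" are all assumptions.

```python
from typing import Dict, List, Union


def check_cost_limit_sketch(
    costs: List[float], limit: float
) -> Dict[str, Union[bool, float]]:
    """Compare the summed cost of a user's records against a limit."""
    total = sum(costs)
    return {
        "total_cost": total,
        "limit": limit,
        # Never report a negative remaining budget.
        "remaining": max(limit - total, 0.0),
        "exceeded": total > limit,
    }


under = check_cost_limit_sketch([0.25, 0.5], limit=1.0)
over = check_cost_limit_sketch([0.75, 0.5], limit=1.0)
```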

is_in_whitelist(user_id)[source]#

Check if user is in whitelist.

Return type:

bool

check_usage(user_id, input_tokens, output_tokens, usage_metadata=None)[source]#

Check usage limits.

Return type:

dict
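
The package docstring on this page shows a check_usage result with the keys 'token-left', 'can_execute', 'message', and 'time_left'. The sketch below builds a dict of that shape from a simple token budget. The budget arithmetic, the 1,000-token limit, the failure message, and the helper names are assumptions, not the library's logic.

```python
from typing import Dict, Union


def format_time_left(seconds: int) -> str:
    """Render seconds as 'H hours : M min : S sec'."""
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours} hours : {minutes} min : {secs} sec"


def check_usage_sketch(
    used_tokens: int,
    requested_tokens: int,
    limit: int,
    seconds_until_reset: int,
) -> Dict[str, Union[int, bool, str]]:
    """Decide whether a request fits in the remaining token budget."""
    remaining = limit - used_tokens - requested_tokens
    can_execute = remaining >= 0
    return {
        "token-left": max(remaining, 0),
        "can_execute": can_execute,
        "message": "" if can_execute else "Token limit exceeded.",
        "time_left": format_time_left(seconds_until_reset),
    }


result = check_usage_sketch(
    used_tokens=0,
    requested_tokens=100,
    limit=1000,
    seconds_until_reset=23 * 3600 + 59 * 60 + 59,
)
```

A caller would typically test result["can_execute"] before making the model call and show result["message"] to the user otherwise.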

get_usage_metadata_statistics(user_id=None)[source]#

Get detailed usage statistics from usage metadata.

Return type:

Dict[str, Any]
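
Statistics of this kind can be derived by parsing each stored usage_metadata_json value and summing its token counts. In the sketch below, the function name, the metadata keys, and the shape of the result are assumptions. Rows that fail to parse are skipped.

```python
import json
from collections import Counter
from typing import Any, Dict, List


def metadata_statistics_sketch(metadata_json_rows: List[str]) -> Dict[str, Any]:
    """Aggregate token counts across stored metadata JSON strings."""
    totals: Counter = Counter()
    parsed_rows = 0
    for raw in metadata_json_rows:
        try:
            data = json.loads(raw)
        except (TypeError, ValueError):
            continue  # skip rows that are not valid JSON
        if not isinstance(data, dict):
            continue  # skip JSON values that are not objects
        parsed_rows += 1
        for key in ("input_tokens", "output_tokens"):
            totals[key] += int(data.get(key, 0))
    return {
        "records": parsed_rows,
        "input_tokens": totals["input_tokens"],
        "output_tokens": totals["output_tokens"],
        "total_tokens": totals["input_tokens"] + totals["output_tokens"],
    }


stats = metadata_statistics_sketch(
    [
        '{"input_tokens": 100, "output_tokens": 250}',
        '{"input_tokens": 20, "output_tokens": 30}',
        "corrupted",
    ]
)
```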

Module contents#

Database module for Sherpa AI.

This module provides database functionality for tracking user usage and managing whitelists. It exports the UserUsageTracker class which handles token usage tracking on a per-user basis.

Example

>>> from sherpa_ai.database import UserUsageTracker
>>> tracker = UserUsageTracker()
>>> tracker.check_usage("user123", input_tokens=100, output_tokens=0)
{'token-left': 900, 'can_execute': True, 'message': '', 'time_left': '23 hours : 59 min : 59 sec'}
class sherpa_ai.database.UserUsageTracker(db_name='token_counter.db', db_url='sqlite:///token_counter.db', bucket_name=None, s3_file_key=None, log_to_s3=None, log_to_file=None, log_file_path=None, pricing_manager=None, engine=None, session=None, verbose_logger=None)[source]#

Bases: object

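Clean, minimal usage tracker with essential functionality only. This is the same class as sherpa_ai.database.user_usage_tracker.UserUsageTracker, re-exported at package level; its methods are documented under that module above.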
