Librarian of Tools

A sophisticated query embedding model designed to act as an intelligent "librarian" for discovering and retrieving the most relevant tools and APIs based on natural language queries.

🎯 Purpose

This model serves as a Librarian of Tools - an AI system that understands user intentions and finds the most appropriate tools, APIs, or functions to accomplish their tasks. It's particularly effective for:

  • API Discovery: Finding relevant RapidAPI endpoints in large collections
  • Iterative Search: Progressive tool discovery with residual-based refinement

πŸ—οΈ Model Architecture

  • Base Model: ToolBench/ToolBench_IR_bert_based_uncased
  • Architecture: Query embedding with scale prediction
  • Special Features:
    • Query-Specific Adaptation: Tailors embeddings to individual query characteristics
    • Balanced Magnitude Handling: Maintains appropriate scaling for retrieval tasks
    • Residual-Based Iteration: Supports iterative search for comprehensive tool discovery

🎓 Training Strategy

  • Training Approach: Dynamic, direction-focused training with the AdamW optimizer
  • Loss Function: Combination of MSE, direction (cosine), and magnitude losses
  • Scale Prediction: Applies softplus(x) + 1 so predicted scale factors always exceed 1
  • Dataset: Pairs of queries and the sum vector of their relevant API embeddings, built from ToolBench (https://github.com/OpenBMB/ToolBench)
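The exact weighting of the three loss terms is not documented; the numpy sketch below shows one plausible way they combine, assuming equal weights (the weight values and the `combined_loss` name are illustrative assumptions, not the released training code):

```python
import numpy as np

def combined_loss(pred, target, w_mse=1.0, w_dir=1.0, w_mag=1.0):
    """Hypothetical combination of the three documented loss terms."""
    # MSE between predicted embedding and the target sum-vector
    mse = np.mean((pred - target) ** 2)
    # Direction loss: 1 - cosine similarity (penalizes angular error)
    denom = np.linalg.norm(pred) * np.linalg.norm(target) + 1e-8
    direction = 1.0 - np.dot(pred, target) / denom
    # Magnitude loss: mismatch in vector length
    magnitude = abs(np.linalg.norm(pred) - np.linalg.norm(target))
    return w_mse * mse + w_dir * direction + w_mag * magnitude

v = np.array([1.0, 2.0, 3.0])
print(combined_loss(v, v))       # ~0 for a perfect prediction
print(combined_loss(v, -v) > 0)  # True
```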

🚀 Usage

Basic Usage

from transformers import AutoTokenizer, AutoModel
import torch

# Load the Librarian model
model = AutoModel.from_pretrained("jhleepidl/librarian")
tokenizer = AutoTokenizer.from_pretrained("jhleepidl/librarian")

# Ask the librarian to find tools
query = "How to send an email using Python?"
inputs = tokenizer(query, return_tensors="pt", truncation=True, max_length=256)

# Get the embedding for tool retrieval
with torch.no_grad():
    outputs = model(**inputs)
    # Assuming a standard encoder output: take the [CLS] token state as the
    # query embedding (adapt this if the checkpoint ships a custom head)
    embedding = outputs.last_hidden_state[:, 0]

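If the checkpoint exposes only the base encoder (no custom pooling head), mean pooling over non-padding tokens is a common alternative to the [CLS] state. This pooling choice is an assumption about the model, not documented behavior; the sketch below uses random tensors in place of real encoder output:

```python
import torch

def mean_pool(last_hidden_state, attention_mask):
    """Average token states, ignoring padding positions."""
    mask = attention_mask.unsqueeze(-1).float()      # (batch, seq, 1)
    summed = (last_hidden_state * mask).sum(dim=1)   # (batch, hidden)
    counts = mask.sum(dim=1).clamp(min=1e-9)         # (batch, 1)
    return summed / counts

# Toy check with random "hidden states"
hidden = torch.randn(2, 4, 8)
mask = torch.tensor([[1, 1, 1, 0], [1, 1, 0, 0]])
vec = mean_pool(hidden, mask)
print(vec.shape)  # torch.Size([2, 8])
```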
Advanced Usage with Similarity Search

import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# Get embeddings for multiple queries
queries = [
    "Send email with attachment",
    "Download file from URL",
    "Parse JSON data",
    "Make HTTP POST request"
]

embeddings = []
for query in queries:
    inputs = tokenizer(query, return_tensors="pt", truncation=True, max_length=256)
    with torch.no_grad():
        outputs = model(**inputs)
        embedding = outputs.last_hidden_state[:, 0].cpu().numpy()  # [CLS] state
        embeddings.append(embedding.flatten())

# Find similar queries
similarity_matrix = cosine_similarity(embeddings)
print("Query similarity matrix:")
print(similarity_matrix)

Iterative Search for Comprehensive Tool Discovery

The model supports iterative search, which progressively discovers tools by removing found APIs from the query representation and continuing the search:
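The mechanism can be seen in a toy numpy example: because the training target is the sum of the relevant API embeddings, subtracting each retrieved tool's vector leaves a residual pointing at the tools still missing. The 3-D "API embeddings" below are synthetic, not real model output:

```python
import numpy as np

# Toy "API embeddings" (rows), unit length so dot products are cosine scores
apis = np.array([
    [1.0, 0.0, 0.0],   # e.g. "send email"
    [0.0, 1.0, 0.0],   # e.g. "process image"
    [0.0, 0.0, 1.0],   # e.g. "analyze data"
])

# A query needing the first two tools maps near the sum of their embeddings
query = apis[0] + apis[1]

found = []
residual = query.copy()
for _ in range(len(apis)):
    scores = apis @ (residual / np.linalg.norm(residual))
    best = int(np.argmax(scores))
    if scores[best] < 0.5:               # similarity threshold
        break
    found.append(best)
    residual = residual - apis[best]     # subtract the found tool
    if np.linalg.norm(residual) < 1e-6:  # query fully explained
        break

print(found)  # [0, 1]
```

Each iteration retrieves one tool and shrinks the residual, so the search terminates once the query vector is fully accounted for.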

import torch
import torch.nn.functional as F
import numpy as np

class LibrarianSearch:
    def __init__(self, model, tokenizer, vector_db_index, documents, threshold=0.5):
        self.model = model
        self.tokenizer = tokenizer
        self.index = vector_db_index
        self.documents = documents
        self.threshold = threshold
    
    def get_query_embedding(self, query, normalize=False):
        """Get query embedding using the Librarian model"""
        inputs = self.tokenizer(
            query, 
            return_tensors="pt", 
            truncation=True, 
            max_length=256
        )
        
        with torch.no_grad():
            outputs = self.model(**inputs)
            # [CLS] token state as the query embedding
            embedding = outputs.last_hidden_state[:, 0].cpu().numpy()
            
        if normalize:
            embedding = embedding / np.linalg.norm(embedding)
        return embedding.flatten()
    
    def iterative_greedy_search(self, query, remove_duplicates=True):
        """
        Perform iterative greedy search to find multiple relevant tools
        
        Args:
            query: The search query
            remove_duplicates: Whether to avoid returning duplicate APIs
            
        Returns:
            List of found APIs with scores
        """
        # Get initial query embedding (unnormalized for residual calculation)
        query_embedding_unnorm = self.get_query_embedding(query, normalize=False)
        query_embedding_norm = self.get_query_embedding(query, normalize=True)
        
        found_apis = []
        current_query_unnorm = query_embedding_unnorm.copy()
        current_query_norm = query_embedding_norm.copy()
        found_api_keys = set() if remove_duplicates else None
        
        while True:
            # Search for the best matching API
            scores, indices = self.index.search(
                current_query_norm.reshape(1, -1), 1
            )
            
            if indices[0][0] == -1 or scores[0][0] < self.threshold:
                break
            
            # Get the found API
            idx = indices[0][0]
            doc = self.documents[idx]
            api_key = f"{doc['metadata']['tool_name']}_{doc['metadata']['api_name']}"
            
            # Check for duplicates
            if remove_duplicates and api_key in found_api_keys:
                # Calculate residual and continue
                doc_embedding = self.get_embedding_by_index(idx)
                residual = current_query_unnorm - doc_embedding
                residual_norm = np.linalg.norm(residual)
                
                if residual_norm > np.linalg.norm(current_query_unnorm):
                    break
                
                current_query_unnorm = residual
                if residual_norm > 0:
                    current_query_norm = residual / residual_norm
                else:
                    break
                continue
            
            # Add the found API
            found_apis.append({
                'tool_name': doc['metadata']['tool_name'],
                'api_name': doc['metadata']['api_name'],
                'score': float(scores[0][0])
            })
            
            if remove_duplicates:
                found_api_keys.add(api_key)
            
            # Calculate residual and update query
            doc_embedding = self.get_embedding_by_index(idx)
            residual = current_query_unnorm - doc_embedding
            residual_norm = np.linalg.norm(residual)
            
            if residual_norm > np.linalg.norm(current_query_unnorm):
                break
            
            current_query_unnorm = residual
            if residual_norm > 0:
                current_query_norm = residual / residual_norm
            else:
                break
        
        return found_apis
    
    def get_embedding_by_index(self, idx):
        """Get embedding for a specific document index"""
        # Implementation depends on your vector database setup (FAISS reconstruct shown)
        return self.index.reconstruct(int(idx))

# Usage example
librarian_search = LibrarianSearch(model, tokenizer, vector_db_index, documents)

# Find multiple tools for a complex query
query = "I need to send emails, process images, and analyze data"
found_tools = librarian_search.iterative_greedy_search(query)

print("Found tools:")
for tool in found_tools:
    print(f"- {tool['tool_name']}.{tool['api_name']} (score: {tool['score']:.3f})")

Beam Search for Optimal Tool Combinations

For more sophisticated tool discovery, the iterative greedy search can be extended with beam search:

def beam_search_iterative(self, query, beam_size=5):
    """
    Perform beam search to find optimal combinations of tools
    
    Args:
        query: The search query
        beam_size: Number of beams to maintain
        
    Returns:
        Best combination of APIs found
    """
    # Implementation similar to iterative_greedy_search but maintains
    # multiple candidate paths (beams) and selects the best combination
    # This is useful for complex queries requiring multiple tools
    
    # ... beam search implementation ...
    pass
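The stub above is left unimplemented; here is one possible standalone sketch operating on plain numpy embeddings rather than the class's vector index (the `beam_search_tools` name and all details are illustrative assumptions). Each beam tracks a residual vector, the tools chosen so far, and a cumulative score:

```python
import numpy as np

def beam_search_tools(query_vec, api_embs, beam_size=3, max_steps=4, threshold=0.5):
    """Toy beam search over unit-norm API embeddings (rows of api_embs).

    Each beam is (residual, chosen_indices, cumulative_score); the indices
    of the highest-scoring finished beam are returned.
    """
    beams = [(query_vec.astype(float), [], 0.0)]
    finished = []
    for _ in range(max_steps):
        candidates = []
        for residual, chosen, score in beams:
            norm = np.linalg.norm(residual)
            if norm < 1e-6:                      # nothing left to explain
                finished.append((chosen, score))
                continue
            sims = api_embs @ (residual / norm)  # cosine scores vs. residual
            if chosen:
                sims[chosen] = -np.inf           # never pick a tool twice
            top = np.argsort(sims)[::-1][:beam_size]
            expansions = [int(i) for i in top if sims[i] >= threshold]
            if not expansions:                   # beam cannot continue
                finished.append((chosen, score))
                continue
            for idx in expansions:
                candidates.append((residual - api_embs[idx],
                                   chosen + [idx],
                                   score + float(sims[idx])))
        if not candidates:
            break
        candidates.sort(key=lambda b: b[2], reverse=True)
        beams = candidates[:beam_size]           # keep only the best beams
    finished.extend((chosen, score) for _, chosen, score in beams)
    return max(finished, key=lambda f: f[1])[0] if finished else []

# Toy usage with the same synthetic embeddings idea as above
apis = np.eye(3)
query = apis[0] + apis[1]
result = beam_search_tools(query, apis, beam_size=2)
print(sorted(result))  # [0, 1]
```

Integrating this with LibrarianSearch would replace the `api_embs @ ...` scoring with calls to the vector index's search method.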

📚 Citation

If you use this model in your research or applications, please cite:

@misc{librarian_of_tools,
  title={Librarian of Tools},
  author={jhleepidl},
  year={2025},
  url={https://github.com/jhleepidl/librarian}
}

📄 License

This model is released under the MIT License, making it suitable for both research and commercial applications.


The Librarian of Tools - Your intelligent assistant for discovering the right tools for any task! 🛠️📚

Model size: 0.1B parameters (F32, Safetensors)